PREFACE TO THE RUSSIAN EDITION


Mathematics, which originated in antiquity in the needs of daily life, 
has developed into an immense system of widely varied disciplines. Like
the other sciences, it reflects the laws of the material world around us 
and serves as a powerful instrument for our knowledge and mastery of 
nature. But the high level of abstraction peculiar to mathematics means 
that its newer branches are relatively inaccessible to nonspecialists. This 
abstract character of mathematics gave birth even in antiquity to 
idealistic notions about its independence of the material world. 

In preparing the present volume, the authors have kept in mind the
goal of acquainting a sufficiently wide circle of the Soviet intelligentsia
with the various mathematical disciplines, their content and methods,
the foundations on which they are based, and the paths along which
they have developed.

As a minimum of necessary mathematical knowledge on the part of
the reader, we have assumed only secondary-school mathematics, but
the volumes differ from one another with respect to the accessibility of
the material contained in them. Readers wishing to acquaint themselves
for the first time with the elements of higher mathematics may profitably
read the first few chapters, but for a complete understanding of the
subsequent parts it will be necessary to have made some study of
corresponding textbooks. The book as a whole will be understood in a
fundamental way only by readers who already have some acquaintance
with the applications of mathematical analysis; that is to say, with the
differential and integral calculus. For such readers, namely teachers of
mathematics and instructors in engineering and the natural sciences, it
will be particularly important to read those chapters which introduce
the newer branches of mathematics.

Naturally it has not been possible, within the limits of one book, to
exhaust all the riches of even the most fundamental results of mathematical
research; a certain freedom in the choice of material has been inevitable
here. But along general lines, the present book will give an idea of the
present state of mathematics, its origins, and its probable future
development. For this reason the book is also intended to some extent for
persons already acquainted with most of the factual material in it. It may
perhaps help to remove a certain narrowness of outlook occasionally to be
found in some of our younger mathematicians.

The separate chapters of the book are written by various authors,
whose names are given in the Contents. But as a whole the book is the
result of collaboration. Its general plan, the choice of material, and the
successive versions of individual chapters were all submitted to general
discussion, and improvements were made on the basis of a lively exchange
of opinions. Mathematicians from several cities in the Soviet Union
were given an opportunity, in the form of organized discussion, to make
many valuable remarks concerning the original version of the text. Their
opinions and suggestions were taken into account by the authors.

The authors of some of the chapters also took a direct share in
preparing the final version of other chapters: The introductory part of
Chapter II was written essentially by B. N. Delone, while D. K. Faddeev
played an active role in the preparation of Chapter IV and Chapter XX.

A share in the work was also taken by several persons other than the
authors of the individual chapters: §4 of Chapter XIV was written by
L. V. Kantorovic, §6 of Chapter VI by O. A. Ladyzenskaja, §5 of
Chapter X by A. G. Postnikov; work was done on the text of Chapter V
by O. A. Oleïnik and on Chapter XI by Ju. V. Prohorov.

Certain sections of Chapters I, II, VII, and XVII were written by
V. A. Zalgaller. The editing of the final text was done by V. A. Zalgaller
and V. S. Videnskiï with the cooperation of T. V. Rogozkinaja and
A. P. Leonovaja.

The greater part of the illustrations were prepared by E. P. Sen'kin. 
FOREWORD BY THE EDITOR OF THE TRANSLATION


Mathematics, in view of its abstractness, offers greater difficulty to the
expositor than any other science. Yet its rapidly increasing role in modern
life creates both a need and a desire for good exposition.

In recent years many popular books about mathematics have appeared
in the English language, and some of them have enjoyed an immense
sale. But for the most part they have contained little serious mathematical
instruction, and many of them have neglected the twentieth century, the
undisputed “golden age” of mathematics. Although they are admirable
in many other ways, they have not yet undertaken the ultimate task of
mathematical exposition, namely the large-scale organization of modern
mathematics in such a way that the reader is constantly delighted by the
obvious economizing of his own time and effort. Anyone who reads
through some of the chapters in the present book will realize how well
this task has been carried out by the Soviet authors, in the systematic
collaboration they have described in their preface.

Such a book, written for “a wide circle of the intelligentsia,” must also 
discuss the general cultural importance of mathematics and its continuous 
development from the earliest beginnings of history down to the present
day. To form an opinion of the book from this point of view the reader 
need only glance through the first chapter in Part 1 and the introduction 
to certain other chapters; for example, Analysis, or Analytic Geometry. 

In translating the passages on the history and cultural significance of
mathematical ideas, the translators have naturally been aware of even
greater difficulties than are usually associated with the translation of
scientific texts. As organizer of the group, I express my profound
gratitude to the other two translators, Tamas Bartha and Kurt Hirsch, for
their skillful cooperation.
The present translation, which was originally published by the American
Mathematical Society, will now enjoy a more general distribution in
its new format. In thus making the book more widely available the
Society has been influenced by various expressions of opinion from
American mathematicians. For example, “. . . the book will contribute
materially to a better understanding by the public of what mathematicians
are up to. . . . It will be useful to many mathematicians, physicists and
chemists, as well as to laymen. . . . Whether a physicist wishes to know
what a Lie algebra is and how it is related to a Lie group, or an
undergraduate would like to begin the study of homology, or a crystallographer
is interested in Fedorov groups, or an engineer in probability, or any
scientist in computing machines, he will find here a connected, lucid
account.”

In its first edition this translation has been widely read by
mathematicians and students of mathematics. We now look forward to its wider
usefulness in the general English-speaking world.

August, 1964 

S. H. Gould 
Editor of Translations 
American Mathematical Society 
Providence, Rhode Island 
PART 3

CHAPTER VI

PARTIAL DIFFERENTIAL EQUATIONS


§1. Introduction 

In the study of the phenomena of nature, partial differential
equations are encountered just as often as ordinary ones. As a rule this
happens in cases where an event is described by a function of several
variables. From the study of nature there arose that class of partial
differential equations that is at the present time the most thoroughly
investigated and probably the most important in the general structure of human
knowledge, namely the equations of mathematical physics.

Let us first consider oscillations in any kind of medium. In such
oscillations every point of the medium, occupying in equilibrium the position
$(x, y, z)$, will at time $t$ be displaced along a vector $\mathbf{u}(x, y, z, t)$,
depending on the initial position of the point $(x, y, z)$ and on the time $t$.
In this case the process in question will be described by a vector field. But
it is easy to see that knowledge of this vector field, namely the field of
displacements of points of the medium, is not sufficient in itself for a full
description of the oscillation. It is also necessary to know, for example, the
density $\rho(x, y, z, t)$ at each point of the medium, the temperature
$T(x, y, z, t)$, and the internal stress, i.e., the forces exerted on an
arbitrarily chosen volume of the body by the entire remaining part of it.

Physical events and processes occurring in space and time always consist
of the changes, during the passage of time, of certain physical magnitudes
related to the points of the space. As we saw in Chapter II these quantities
can be described by functions with four independent variables, $x$, $y$, $z$,
and $t$, where $x$, $y$, and $z$ are the coordinates of a point of the space,
and $t$ is the time.


Physical quantities may be of different kinds. Some are completely
characterized by their numerical values, e.g., temperature, density, and
the like, and are called scalars. Others have direction and are therefore
vector quantities: velocity, acceleration, the strength of an electric field,
etc. Vector quantities may be expressed not only by the length of the
vector and its direction but also by its “components” if we decompose
it into the sum of three mutually perpendicular vectors, for example
parallel to the coordinate axes.

In mathematical physics a scalar quantity or a scalar field is represented
by one function of four independent variables, whereas a vector quantity
defined on the whole space or, as it is called, a vector field is described by
three functions of these variables. We can write such a quantity either in
the form

$$\mathbf{u}(x, y, z, t),$$

where the boldface type indicates that $\mathbf{u}$ is a vector, or in the form
of three functions

$$u_x(x, y, z, t), \quad u_y(x, y, z, t), \quad u_z(x, y, z, t),$$

where $u_x$, $u_y$, and $u_z$ denote the projections of the vector on the
coordinate axes.

In addition to vector and scalar quantities, still more complicated entities 
occur in physics, for example the state of stress of a body at a given point.
Such quantities are called tensors; after a fixed choice of coordinate axes, 
they may be characterized everywhere by a set of functions of the same 
four independent variables. 

In this manner, the description of widely different kinds of physical 
phenomena is usually given by means of several functions of several 
variables. Of course, such a description cannot be absolutely exact. 

For example, when we describe the density of a medium by means of
one function of four independent variables, we ignore the fact that at a
given point we cannot have any density whatsoever. The bodies we are
investigating have a molecular structure, and the molecules are not
contiguous but occur at finite distances from one another. The distances
between molecules are for the most part considerably larger than the
dimensions of the molecules themselves. Thus the density in question is
the ratio of the mass contained in some small, but not extremely small,
volume to this volume itself. The density at a point we usually think of as
the limit of such ratios for decreasing volumes. A still greater simplification
and idealization is introduced in the concept of the temperature of a
medium. The heat in a body is due to the random motion of its molecules.
The energy of the molecules differs, but if we consider a volume containing
a large collection of molecules, then the average energy of their random
motions will define what is called temperature.

Similarly, when we speak of the pressure of a gas or a liquid on the wall
of a container, we should not think of the pressure as though a particle
of the liquid or gas were actually pressing against the wall of the container.
In fact, these particles, in their random motion, hit the wall of the container
and bounce off it. So what we describe as pressure against the wall is
actually made up of a very large number of impulses received by a section
of the wall that is small from an everyday point of view but extremely
large in comparison with the distances between the molecules of the liquid
or gas. It would be easy to give dozens of examples of a similar nature.
The majority of the quantities studied in physics have exactly the same
character. Mathematical physics deals with idealized quantities, abstracting
them from the concrete properties of the corresponding physical entities
and considering only the average values of these quantities.

Such an idealization may appear somewhat coarse but, as we will see,
it is very useful, since it enables us to make an excellent analysis of many
complicated matters, in which we consider only the essential elements and
omit those features which are secondary from our point of view.

The object of mathematical physics is to study the relations existing
among these idealized elements, these relations being described by sets of
functions of several independent variables.


§2. The Simplest Equations of Mathematical Physics 

The elementary connections and relations among physical quantities are
expressed by the laws of mechanics and physics. Although these relations
are extremely varied in character, they give rise to more complicated ones,
which are derived from them by mathematical argument and are even
more varied. The laws of mechanics and physics may be written in
mathematical language in the form of partial differential equations, or perhaps
integral equations, relating unknown functions to one another. To
understand what is meant here, let us consider some examples of the
equations of mathematical physics.

Equations of conservation of mass and of heat energy. Let us express
in mathematical form the basic physical laws governing the motions of a
medium.

1. First of all we express the law of conservation of the matter contained
in any volume $\Omega$ which we mentally mark off in a space and keep fixed.
For this purpose we must calculate the mass of the matter contained in
this volume. The mass $M_\Omega(t)$ is expressed by the integral

$$M_\Omega(t) = \iiint_\Omega \rho(x, y, z, t)\, dx\, dy\, dz.$$

This mass will not, of course, be constant; in an oscillatory process the
density at each point will be changing in view of the fact that the particles
of matter in their oscillations will at one time enter this volume and at
another leave it. The rate of change of the mass can be found by
differentiation with respect to time and is given by the integral

$$\frac{dM_\Omega}{dt} = \iiint_\Omega \frac{\partial \rho}{\partial t}\, dx\, dy\, dz.$$

This rate of change of the mass contained in the volume may also be
calculated in another way. We may express the amount of matter which
passes through the surface $S$, bounding our volume $\Omega$, at each second of
time, where the matter leaving $\Omega$ must be taken with a minus sign. To this
end we consider an element $ds$ of the surface $S$ sufficiently small that it
may be assumed to be plane and have the same displacement for all its
points. We will follow the displacement of points on this segment of the
surface during the interval of time from $t$ to $t + dt$. First of all we
compute the vector

$$\mathbf{v} = \frac{d\mathbf{u}}{dt},$$

which represents the velocity of each particle. In the time $dt$ the particles
on $ds$ move along the vector $\mathbf{v}\,dt$, and take up a position $ds_1$,
while the position $ds$ will now be occupied by the particles which were
formerly at the position $ds_2$ (figure 1). So during this time the column of
matter leaving the volume $\Omega$ will be that which was earlier contained
between $ds_2$ and $ds$. The altitude of this small column is equal to
$v\,dt \cos(\mathbf{n}, \mathbf{v})$, where $\mathbf{n}$ denotes the exterior
normal to the surface; the volume of the small column will thus be equal to

$$v \cos(\mathbf{n}, \mathbf{v})\, ds\, dt,$$
and the mass equal to

$$\rho v \cos(\mathbf{n}, \mathbf{v})\, ds\, dt.$$

Adding together all these small pieces, we get for the amount of matter
leaving the volume during the time $dt$ the expression

$$\iint_S \rho v \cos(\mathbf{n}, \mathbf{v})\, ds\, dt.$$

At those points where the velocity is directed toward the interior of $\Omega$ the
sign of the cosine will be negative, which means that in this integral the
matter entering $\Omega$ is taken with a minus sign. The product of the velocity
of motion of the medium with its density is called its flux. The flux vector
of the mass is $\mathbf{q} = \rho\mathbf{v}$.

In order to find the rate of flow of matter out of the volume $\Omega$ it is
sufficient to divide this expression by $dt$, so that for the rate of flow we have

$$\iint_S \rho v_n\, ds = \iint_S q_n\, ds,$$

where

$$v_n = v \cos(\mathbf{n}, \mathbf{v}), \qquad q_n = q \cos(\mathbf{n}, \mathbf{q}).$$

The normal component of the vector $\mathbf{v}$ may be replaced by its expression
in terms of the components of the vectors $\mathbf{v}$ and $\mathbf{n}$ along the
coordinate axes. From analytic geometry we know that

$$v_n = v \cos(\mathbf{n}, \mathbf{v}) = v_x \cos(\mathbf{n}, x) + v_y \cos(\mathbf{n}, y) + v_z \cos(\mathbf{n}, z),$$

hence we can rewrite the expression for the rate of flow in the form

$$\iint_S \rho\,\bigl(v_x \cos(\mathbf{n}, x) + v_y \cos(\mathbf{n}, y) + v_z \cos(\mathbf{n}, z)\bigr)\, ds.$$

From the law of conservation of matter, these two methods of computing
the change in the amount of matter must give the same result, since all
change in the mass included in $\Omega$ can occur only as a result of the entering
or leaving of mass through the surface $S$.

Hence, equating the rate of change of the amount of matter contained
in the volume with the rate of flow of matter into the volume, we get

$$\iiint_\Omega \frac{\partial \rho}{\partial t}\, dx\, dy\, dz
= -\iint_S \bigl[\rho v_x \cos(\mathbf{n}, x) + \rho v_y \cos(\mathbf{n}, y) + \rho v_z \cos(\mathbf{n}, z)\bigr]\, ds
= -\iint_S \bigl[q_x \cos(\mathbf{n}, x) + q_y \cos(\mathbf{n}, y) + q_z \cos(\mathbf{n}, z)\bigr]\, ds.$$
This integral relation, as we have said, is true for any volume $\Omega$. It is called
“the equation of continuity.”

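The integral occurring on the right side of the last equation may be
transformed into a volume integral by using Ostrogradskiĭ's formula.
This formula, derived in Chapter II, gives

$$\iint_S \bigl(\rho v_x \cos(\mathbf{n}, x) + \rho v_y \cos(\mathbf{n}, y) + \rho v_z \cos(\mathbf{n}, z)\bigr)\, ds
= \iiint_\Omega \left(\frac{\partial(\rho v_x)}{\partial x} + \frac{\partial(\rho v_y)}{\partial y} + \frac{\partial(\rho v_z)}{\partial z}\right) dx\, dy\, dz.$$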


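Hence it follows that

$$\iiint_\Omega \left[\frac{\partial \rho}{\partial t} + \frac{\partial(\rho v_x)}{\partial x} + \frac{\partial(\rho v_y)}{\partial y} + \frac{\partial(\rho v_z)}{\partial z}\right] dx\, dy\, dz = 0.$$

So we get the following result: the integral of the function

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho v_x)}{\partial x} + \frac{\partial(\rho v_y)}{\partial y} + \frac{\partial(\rho v_z)}{\partial z}
= \frac{\partial \rho}{\partial t} + \frac{\partial q_x}{\partial x} + \frac{\partial q_y}{\partial y} + \frac{\partial q_z}{\partial z}$$

over any volume $\Omega$ is equal to zero. But this is possible only if the function
is identically zero. We thus obtain the equation of continuity in differential
form

$$\frac{\partial \rho}{\partial t} + \frac{\partial(\rho v_x)}{\partial x} + \frac{\partial(\rho v_y)}{\partial y} + \frac{\partial(\rho v_z)}{\partial z} = 0. \tag{1}$$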


Equation (1) is a typical example of the formulation of a physical law in
the language of partial differential equations.
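
As a quick numerical check of equation (1), the following sketch takes a
one-dimensional flow in which a density profile is carried along rigidly with
constant velocity, so that the density has the form f(x - ct) and v = c. The
profile `rho`, the speed `c`, and the helper `continuity_residual` are
illustrative assumptions, not taken from the text; the derivatives are only
approximated by central differences, so the residual is small rather than
exactly zero.

```python
import math


def rho(x, t, c=2.0):
    """Density profile transported rigidly with speed c (illustrative choice)."""
    return 1.0 + 0.5 * math.exp(-((x - c * t) ** 2))


def continuity_residual(x, t, c=2.0, h=1e-4):
    """Central-difference estimate of d(rho)/dt + d(rho*v)/dx with v = c."""
    drho_dt = (rho(x, t + h, c) - rho(x, t - h, c)) / (2 * h)
    dflux_dx = (c * rho(x + h, t, c) - c * rho(x - h, t, c)) / (2 * h)
    return drho_dt + dflux_dx


if __name__ == "__main__":
    # The residual should be close to zero at every sample point.
    for x in (-1.0, 0.3, 2.5):
        print(x, continuity_residual(x, t=0.7))
```

The same check fails for a profile that does not move with the fluid, which
is one way to see that (1) ties the change of density to the flow.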

2. Let us consider another such problem, namely the problem of heat
conduction.

In any medium whose particles are in motion on account of heat, the
heat flows from some points to others. This flow of heat will occur through
every element of surface $ds$ lying in the given medium. It can be shown that
the process may be described numerically by a single vector quantity, the
heat-conduction vector, which we denote by $\boldsymbol{\tau}$. Then the amount
of heat flowing per second through an element of area $ds$ will be expressed by
$\tau_n\, ds$, in the same way as $q_n\, ds$ earlier expressed the amount of
material passing per second through an area $ds$. In place of the flux of liquid
$\mathbf{q} = \rho\mathbf{v}$ we have the heat flow vector $\boldsymbol{\tau}$.

In the same way as we obtained the equation of continuity, which for
the motion of a liquid expresses the law of conservation of mass, we may
obtain a new partial differential equation expressing the law of
conservation of energy, as follows.
The volume density of heat energy $Q$ at a given point may be expressed
by the formula

$$Q = CT,$$

where $C$ is the heat capacity and $T$ is the temperature.

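Here it is easy to establish the equation

$$C\frac{\partial T}{\partial t} + \frac{\partial \tau_x}{\partial x} + \frac{\partial \tau_y}{\partial y} + \frac{\partial \tau_z}{\partial z} = 0. \tag{2}$$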


The derivation of this equation is identical with the derivation of the
equation of continuity, if we replace “density” by “density of heat energy”
and flow of mass by flow of heat. Here we have assumed that the heat
energy in the medium never increases. But if there is a source of heat
present in the medium, equation (2) for the balance of heat energy must
be modified. If $q$ is the productivity density of the source, that is the amount
of heat energy produced per unit of volume in one second, then the
equation of conservation of heat energy has the following more
complicated form:

$$C\frac{\partial T}{\partial t} + \frac{\partial \tau_x}{\partial x} + \frac{\partial \tau_y}{\partial y} + \frac{\partial \tau_z}{\partial z} = q. \tag{3}$$


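3. Still another equation of the same type as the equation of continuity
may be derived by differentiating equation (1) with respect to time. Let us
do this for the equation of small oscillations of a gas near a position of
equilibrium. We will assume that for such oscillations changes of the
density are not great and the quantities $\partial\rho/\partial x$,
$\partial\rho/\partial y$, $\partial\rho/\partial z$, and $\partial\rho/\partial t$
are sufficiently small that their products with $v_x$, $v_y$, and $v_z$ may be
ignored. Then

$$\frac{\partial \rho}{\partial t} + \rho\left(\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}\right) = 0.$$

Differentiating this equation with respect to time and ignoring the products
of $\partial\rho/\partial t$ with $\partial v_x/\partial x$, $\partial v_y/\partial y$,
and $\partial v_z/\partial z$, we obtain

$$\frac{\partial^2 \rho}{\partial t^2} + \rho\,\frac{\partial}{\partial x}\!\left(\frac{\partial v_x}{\partial t}\right) + \rho\,\frac{\partial}{\partial y}\!\left(\frac{\partial v_y}{\partial t}\right) + \rho\,\frac{\partial}{\partial z}\!\left(\frac{\partial v_z}{\partial t}\right) = 0. \tag{4}$$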



Equation of motion. 

1. An important example of the expression of a physical law by a
differential equation occurs in the equations of equilibrium or of motion
of a medium. Let the medium consist of material particles, moving with
various velocities. As in the first example, we mentally mark off in space a
volume $\Omega$, bounded by the surface $S$ and filled with particles of matter of
the medium, and write Newton’s second law for the particles in this volume.
This law states that for every motion of the medium the rate of change of
momentum, summed up for all particles in the volume, is equal to the sum
of all the forces acting on the volume. The momentum, as is known from
mechanics, is represented by the vector quantity

$$\mathbf{P} = \iiint_\Omega \rho\mathbf{v}\, d\Omega.$$


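The particles occupying a small volume $d\Omega$ with density $\rho$ will, after
time $\Delta t$, fill a new volume $d\Omega'$ with density $\rho'$, although the
mass will be unchanged:

$$\rho'\, d\Omega' = \rho\, d\Omega.$$

If the velocity $\mathbf{v}$ changes during this time to a new value $\mathbf{v}'$,
i.e., by the amount $\Delta\mathbf{v} = \mathbf{v}' - \mathbf{v}$, the
corresponding change of momentum will be

$$\rho'\mathbf{v}'\, d\Omega' - \rho\mathbf{v}\, d\Omega = \rho\mathbf{v}'\, d\Omega - \rho\mathbf{v}\, d\Omega = \rho\, \Delta\mathbf{v}\, d\Omega,$$

or in the unit of time:

$$\rho\,\frac{\Delta\mathbf{v}}{\Delta t}\, d\Omega \approx \rho\,\frac{d\mathbf{v}}{dt}\, d\Omega.$$

Adding over all particles in the volume $\Omega$, we find that the rate of
change of momentum is equal to

$$\iiint_\Omega \rho\,\frac{d\mathbf{v}}{dt}\, d\Omega,$$

or, in other words, to the vector with components

$$\iiint_\Omega \rho\,\frac{dv_x}{dt}\, d\Omega, \qquad \iiint_\Omega \rho\,\frac{dv_y}{dt}\, d\Omega, \qquad \iiint_\Omega \rho\,\frac{dv_z}{dt}\, d\Omega.$$

(Here the derivatives $dv_x/dt$, $dv_y/dt$, and $dv_z/dt$ denote the rate of
change of the components of $\mathbf{v}$ not at a given point of the space but
for a given particle. This is what is meant by the notation $d/dt$ instead of
$\partial/\partial t$. As is well known,
$d/dt = \partial/\partial t + v_x(\partial/\partial x) + v_y(\partial/\partial y) + v_z(\partial/\partial z)$.)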

The forces acting on the volume may be of two kinds: volume forces 
acting on every particle of the body, and surface forces or stresses on the 
surface S bounding the volume. The former are long-range forces, while 
the latter are short-range. 

To illustrate these remarks, let us assume that the medium under
consideration is a fluid. The surface forces acting on an element of the
surface $ds$ will in this case have the value $p\, ds$, where $p$ is the pressure
on the fluid, and will be exerted in a direction opposite to that of the exterior
normal.

If we denote the unit vector in the direction of the normal to the surface
$S$ by $\mathbf{n}$, then the forces acting on the section $ds$ will be equal to

$$-p\,\mathbf{n}\, ds.$$

If we let $\mathbf{F}$ denote the vector of the external forces acting on a unit
of volume, our equation takes the form

$$\iiint_\Omega \rho\,\frac{d\mathbf{v}}{dt}\, d\Omega = \iiint_\Omega \mathbf{F}\, d\Omega - \iint_S p\,\mathbf{n}\, ds.$$


This is the equation of motion in integral form. Like the equation of
continuity, this equation also may be transformed into differential form.
We obtain the system:

$$\rho\,\frac{dv_x}{dt} + \frac{\partial p}{\partial x} = F_x, \qquad
\rho\,\frac{dv_y}{dt} + \frac{\partial p}{\partial y} = F_y, \qquad
\rho\,\frac{dv_z}{dt} + \frac{\partial p}{\partial z} = F_z. \tag{5}$$

This system is the differential form of Newton’s second law.


2. Another characteristic example of the application of the laws of
mechanics in differential form is the equation of a vibrating string. A string
is a long, very slender body of elastic material that is flexible because of
its extreme thinness, and is usually tightly stretched. If we imagine the
string divided at any point $x$ into two parts, then on each of the parts
there is exerted a force equal to the tension in the direction of the tangent
to the curve of the string.

Let us examine a short segment of the string. We will denote by $u(x, t)$
the displacement of a point of the string from its position of equilibrium.
We assume that the oscillation of the string occurs in one plane and consists
of displacements perpendicular to the axis $Ox$, and we represent the
displacement $u(x, t)$ graphically at some instant of time (figure 2). We will
investigate the behavior of the segment of the string between the points
$x_1$ and $x_2$. At these points there are two forces acting, which are equal
to the tension $T$ in the direction of the corresponding tangent to $u(x, t)$.

Fig. 2.
If the segment is curved, the resultant of these two forces will not be
equal to zero. This resultant, from the laws of mechanics, must be equal
to the rate of change of momentum of the segment.

Let the mass contained in each centimeter of length of the string be
equal to $\rho$. Then the rate of change of momentum will be

$$\int_{x_1}^{x_2} \rho\,\frac{\partial^2 u}{\partial t^2}\, dx.$$


If the angle between the tangent to the string and the axis $Ox$ is denoted
by $\varphi$, we will have

$$T \sin\varphi_2 - T \sin\varphi_1 = \int_{x_1}^{x_2} \rho\,\frac{\partial^2 u}{\partial t^2}\, dx.$$

This is the usual equation expressing the second law of mechanics in integral
form. It is easy to transform it into differential form. We have obviously

$$\rho\,\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\,(T \sin\varphi).$$

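From well-known theorems of differential calculus, it is easy to relate
$T \sin\varphi$ to the unknown function $u$. We get

$$\tan\varphi = \frac{\partial u}{\partial x}, \qquad
\sin\varphi = \frac{\tan\varphi}{\sqrt{1 + \tan^2\varphi}} = \frac{\partial u/\partial x}{\sqrt{1 + (\partial u/\partial x)^2}},$$

and under the assumption that $(\partial u/\partial x)^2$ is small, we have

$$\sin\varphi \approx \frac{\partial u}{\partial x}.$$

Then

$$T\,\frac{\partial^2 u}{\partial x^2} = \rho\,\frac{\partial^2 u}{\partial t^2}. \tag{6}$$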


This last equation is the equation of the vibrating string in differential
form.
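
To see the vibrating-string equation at work, the sketch below checks by
finite differences that a standing wave on a string of unit length, with
propagation speed equal to the square root of T divided by rho, satisfies it.
The standing-wave profile, the numerical values of `T` and `rho`, and the
helper `string_residual` are illustrative assumptions rather than anything
given in the text.

```python
import math


def u(x, t, T=4.0, rho=1.0):
    """Standing wave on a string of unit length fixed at x = 0 and x = 1."""
    a = math.sqrt(T / rho)  # propagation speed implied by the string equation
    return math.sin(math.pi * x) * math.cos(math.pi * a * t)


def string_residual(x, t, T=4.0, rho=1.0, h=1e-3):
    """Finite-difference estimate of T * u_xx - rho * u_tt at (x, t)."""
    centre = u(x, t, T, rho)
    u_xx = (u(x + h, t, T, rho) - 2 * centre + u(x - h, t, T, rho)) / h**2
    u_tt = (u(x, t + h, T, rho) - 2 * centre + u(x, t - h, T, rho)) / h**2
    return T * u_xx - rho * u_tt


if __name__ == "__main__":
    # Both terms are of order ten in size, yet their difference is tiny.
    print(string_residual(0.3, 0.2))
```

Changing the speed in `u` to anything other than the square root of T/rho
makes the residual visibly nonzero, which illustrates how the equation fixes
the speed of propagation.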


Basic forms of equations of mathematical physics. As mentioned
previously, the various partial differential equations describing physical
phenomena usually form a system of equations in several unknown
functions. But in the great majority of cases it is possible to replace this
system by one equation, as may easily be shown by very simple examples.

For instance, let us turn to the equations of motion considered in the
preceding paragraph. It is required to solve these equations along with
the equation of continuity. The actual methods of solution we will consider
somewhat later.

1. We begin with the equation for steady flow of an idealized fluid.

All possible motions of a fluid can be divided into rotational and
irrotational, the latter also being called potential. Although irrotational
motions are only special cases of motion and, generally speaking, the
motion of a liquid or a gas is always more or less rotational, nevertheless
experience shows that in many cases the motion is irrotational to a high
degree of exactness. Moreover, it may be shown from theoretical
considerations that in a fluid with viscosity equal to zero a motion which is
initially irrotational will remain so.

For a potential motion of a fluid, there exists a scalar function
$U(x, y, z, t)$, called the velocity potential, such that the velocity vector
$\mathbf{v}$ is expressed in terms of this function by the formulas

$$v_x = \frac{\partial U}{\partial x}, \qquad v_y = \frac{\partial U}{\partial y}, \qquad v_z = \frac{\partial U}{\partial z}.$$

In all the cases we have studied up to now, we have had to deal with
systems of four equations in four unknown functions or, in other words,
with one scalar and one vector equation, containing one unknown scalar
function and one unknown vector field. Usually these equations may be
combined into one equation with one unknown function, but this equation
will be of the second order. Let us do this, beginning with the simplest
case.

For potential motion of an incompressible fluid, for which
$\partial\rho/\partial t = 0$, we have two systems of equations: the equation of
continuity

$$\frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z} = 0$$

and the equations of potential motion

$$v_x = \frac{\partial U}{\partial x}, \qquad v_y = \frac{\partial U}{\partial y}, \qquad v_z = \frac{\partial U}{\partial z}.$$

Substituting in the first equation the values of the velocity as given in the
second we have

$$\frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2} = 0. \tag{7}$$


2. The vector field of “heat flow” can also be expressed, by means of
differential equations, in terms of one scalar quantity, the temperature.
It is well known that heat “flows” in the direction from a hot body to a
cold one. Thus the vector of the flow of heat lies in the direction opposite
to that of the so-called temperature-gradient vector. It is also natural to
assume, as is justified by experience, that to a first approximation the
length of this vector is directly proportional to the temperature gradient.
The components of the temperature gradient are

$$\frac{\partial T}{\partial x}, \qquad \frac{\partial T}{\partial y}, \qquad \frac{\partial T}{\partial z}.$$

Taking the coefficient of proportionality to be $k$, we get three equations

$$\tau_x = -k\,\frac{\partial T}{\partial x}, \qquad \tau_y = -k\,\frac{\partial T}{\partial y}, \qquad \tau_z = -k\,\frac{\partial T}{\partial z}.$$

These are to be solved, together with the equation for the conservation of
heat energy

$$C\frac{\partial T}{\partial t} + \frac{\partial \tau_x}{\partial x} + \frac{\partial \tau_y}{\partial y} + \frac{\partial \tau_z}{\partial z} = q.$$


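Replacing $\tau_x$, $\tau_y$, and $\tau_z$ by their values in terms of $T$, we get

$$C\frac{\partial T}{\partial t} = k\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2}\right) + q. \tag{8}$$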
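
The heat-conduction equation just obtained can be explored numerically. The
sketch below treats the one-dimensional case with no source (q = 0) on a rod
of unit length whose ends are held at temperature zero, and advances the
temperature by a simple explicit finite-difference scheme. The scheme, the
grid sizes, the initial profile sin(pi x), and the names `heat_step` and
`simulate` are all assumptions made for illustration; the text itself does
not prescribe a numerical method. For this initial profile the exact solution
decays like exp(-k pi^2 t / C), which gives something to compare against.

```python
import math


def heat_step(T, dx, dt, C=1.0, k=1.0, q=0.0):
    """One explicit step of C*dT/dt = k*d2T/dx2 + q; the end values stay fixed."""
    new = T[:]
    for i in range(1, len(T) - 1):
        second_diff = (T[i + 1] - 2 * T[i] + T[i - 1]) / dx**2
        new[i] = T[i] + dt * (k * second_diff + q) / C
    return new


def simulate(n=50, steps=500, C=1.0, k=1.0):
    """Evolve the profile sin(pi*x) on [0, 1]; return final values and elapsed time."""
    dx = 1.0 / n
    # k*dt/(C*dx^2) = 0.4 keeps this explicit scheme below its stability limit of 0.5.
    dt = 0.4 * C * dx**2 / k
    T = [math.sin(math.pi * i * dx) for i in range(n + 1)]
    for _ in range(steps):
        T = heat_step(T, dx, dt, C, k)
    return T, steps * dt


if __name__ == "__main__":
    T, t = simulate()
    midpoint = len(T) // 2
    print(T[midpoint], math.exp(-math.pi**2 * t))  # numerical vs. exact value at x = 0.5
```

An explicit scheme is used only because it is the shortest to write down;
with a larger time step it becomes unstable, and an implicit scheme would be
the usual remedy.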


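3. Finally, for small vibrations in a gaseous medium, for example the
vibrations of sound, the equation

$$\frac{\partial^2 \rho}{\partial t^2} + \rho\,\frac{\partial}{\partial x}\!\left(\frac{\partial v_x}{\partial t}\right) + \rho\,\frac{\partial}{\partial y}\!\left(\frac{\partial v_y}{\partial t}\right) + \rho\,\frac{\partial}{\partial z}\!\left(\frac{\partial v_z}{\partial t}\right) = 0$$

and the equations of dynamics (5),

$$\rho\,\frac{dv_x}{dt} + \frac{\partial p}{\partial x} = F_x, \qquad
\rho\,\frac{dv_y}{dt} + \frac{\partial p}{\partial y} = F_y, \qquad
\rho\,\frac{dv_z}{dt} + \frac{\partial p}{\partial z} = F_z,$$

give, assuming the absence of external forces ($F_x = F_y = F_z = 0$),

$$\frac{\partial^2 p}{\partial t^2} = a^2\left(\frac{\partial^2 p}{\partial x^2} + \frac{\partial^2 p}{\partial y^2} + \frac{\partial^2 p}{\partial z^2}\right) \tag{9}$$

(to obtain this equation it is sufficient to substitute the expression for the
accelerations, which for small oscillations may be taken as
$\partial\mathbf{v}/\partial t$, into the equation of continuity and to eliminate
the density $\rho$ by using the Boyle-Mariotte law: $p = a^2\rho$).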

Equations (7), (8), and (9) are typical for many problems of
mathematical physics in addition to the ones considered here. The fact that they
have been investigated in detail enables us to gain an understanding of
many physical situations.


§3. Initial-Value and Boundary-Value Problems; Uniqueness of a Solution

With partial differential equations as with ordinary ones, it is the case,
with rare exceptions, that every equation has infinitely many particular
solutions. Thus to solve a concrete physical problem, i.e., to find an
unknown function satisfying some equation, we must know how to choose
the required solution from an infinite set of solutions. For this purpose
it is usually necessary to know not only the equation itself but a certain
number of supplementary conditions. As we saw previously, partial
differential equations are the expression of elementary laws of mechanics
or physics, referring to small particles situated in a medium. But it is not
enough to know only the laws of mechanics, if we wish to predict the
course of some process. For example, to predict the motion of the heavenly
bodies, as is done in astronomy, we must know not only the general
formulation of Newton’s laws but also, assuming that the masses of these
bodies are known, we must know the initial state of the system, i.e., the
position of the bodies and their velocities at some initial instant of time.
Supplementary conditions of this kind are always encountered in solving
the problems of mathematical physics.

Thus. the problems of mathematical physics consist of finding solutions 
of partial differential équations that satisfy certain supplementary condi¬ 
tions. 

The équations (7), (8), (9) differ in structure among themselves. 
Correspondingly different are the physical problems that may be solved 
by means of these équations. 

The Laplace and Poisson equations; harmonic functions and uniqueness
of solution of boundary-value problems for them. Let us analyze these
problems in a little more detail. We begin with the Laplace and Poisson
equations. The Poisson equation is*

$$\Delta u = -4\pi\rho,$$

where $\rho$ is usually the density. In particular, $\rho$ may vanish. For $\rho = 0$
we get the Laplace equation

$$\Delta u = 0.$$

* The symbol $\Delta u$ is an abbreviation for the expression
$\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 + \partial^2 u/\partial z^2$
and is called the Laplacian of the function $u$.

It is not difficult to see that the difference between any two particular
solutions $u_1$ and $u_2$ of the Poisson equation is a function satisfying the
Laplace equation, or in other words is a harmonic function. The entire
manifold of solutions of the Poisson equation is thus reduced to the
manifold of harmonic functions.

If we have been able to construct even one particular solution $u_0$ of the
Poisson equation, and if we define a new unknown function $w$ by

$$u = u_0 + w,$$

we see that $w$ must satisfy the Laplace equation; and in exactly the same
way, we determine the corresponding boundary conditions for $w$. Thus it
is particularly important to investigate boundary-value problems for the
Laplace equation.

As is most often the case with mathematical problems, the proper
statement of the problem for an equation of mathematical physics is
immediately suggested by the practical situation. The supplementary
conditions arising in the solution of the Laplace equation come from the
physical statement of the problem.

Let us consider, for example, the establishment of a steady temperature
in a medium, i.e., the propagation of heat in a medium where the sources
of heat are constant and are situated either inside or outside the medium.
Under these conditions, with the passage of time the temperature attained
at any point of the medium will be independent of the time. Thus to find
the temperature $T$ at each point, we must find that solution of the equation

$$\Delta T + q = \frac{1}{a^2}\frac{\partial T}{\partial t},$$

where $q$ is the density of the distribution of heat sources, which is
independent of $t$. We get

$$\Delta T + q = 0.$$

Thus the temperature in our medium satisfies the Poisson equation. If
the density of heat sources $q$ is zero, then the Poisson equation becomes
the Laplace equation.

In order to find the temperature inside the medium, it is necessary,
from simple physical considerations, to know also what happens on the
boundary of the medium.

Obviously the physical laws previously considered for interior points 
of a body call for quite another formulation at boundary points. 

In the problem of establishing the steady-state temperature, we can
prescribe either the distribution of temperature on the boundary, or the
rate of flow of heat through a unit area of the surface, or finally, a law
connecting the temperature with the flow of heat.

Considering the temperature in a volume $\Omega$, bounded by the surface $S$,
we can write these three conditions as:

$$T|_S = \varphi(Q), \tag{10}$$

or

$$\left.\frac{\partial T}{\partial n}\right|_S = \psi(Q), \tag{10'}$$

or finally, in the most general case

$$\alpha\left.\frac{\partial T}{\partial n}\right|_S + \beta T|_S = \chi(Q), \tag{10''}$$

where $Q$ denotes an arbitrary point of the surface $S$. Conditions of the
form (10) are called boundary conditions. Investigation of the Laplace or
Poisson equation under boundary conditions of one of these types will
show that as a rule the solution is uniquely determined.

Thus, in our search for a solution of the Laplace or Poisson equation it
will usually be necessary and sufficient to be given one arbitrary function
on the boundary of the domain.* Let us examine the Laplace equation in a
little more detail. We will show that a harmonic function $u$, i.e., a
function satisfying the Laplace equation, is completely determined if we
know its values on the boundary of the domain.

First of all we establish the fact that a harmonic function cannot take
on values inside the domain that are larger than the largest value on the
boundary. More precisely, we show that the absolute maximum, as well
as the absolute minimum, of a harmonic function are attained on the
boundary of the domain.

From this it will follow at once that if a harmonic function has a
constant value on the boundary of a domain $\Omega$, then in the interior of this
domain it will also be equal to this constant. For if the maximum and
minimum value of a function are both the same constant, then the function
will be everywhere equal to this constant.

We now establish the fact that the absolute maximum and minimum of
a harmonic function cannot occur inside the domain. First of all, we note
that if the Laplacian $\Delta u$ of the function $u(x, y, z)$ is positive for the whole
domain, then this function cannot have a maximum inside the domain,
and if it is negative, then the function cannot have a minimum inside the
domain. For at a point where the function $u$ attains its maximum it must
have a maximum as a function of each variable separately for fixed values
of the other variables. Thus it follows that every partial derivative of
second order with respect to each variable must be nonpositive. This
means that their sum will be nonpositive, whereas the Laplacian is positive,
which is impossible. Similarly it may be shown that if the function has a
minimum at some interior point, then its Laplacian cannot be negative
at this point. This means that if the Laplacian is negative everywhere in
the domain, then the function cannot have a minimum in this domain.

* The words "arbitrary function" here and in what follows mean that no special
conditions, other than certain requirements of regularity, are imposed on the functions.

If a function is harmonic, it may always be changed by an arbitrarily
small amount in such a way that it will have a positive or negative
Laplacian; to this end it is sufficient to add to it the quantity

$$\pm\eta r^2 = \pm\eta(x^2 + y^2 + z^2),$$

where $\eta$ is an arbitrarily small positive constant; the Laplacian of
$\eta r^2$ is equal to $6\eta$.

The addition of a sufficiently small quantity cannot change the property
that the function has an absolute maximum or absolute minimum within
the domain. If a harmonic function were to have a maximum inside the
domain, then by adding $+\eta r^2$ to it, we would get a function with a positive
Laplacian which, as was shown above, could not have a maximum inside
the domain. This means that a harmonic function cannot have an absolute
maximum inside the domain. Similarly, it can be shown that a harmonic
function cannot have an absolute minimum inside the domain.
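
The maximum property can be observed numerically. The sketch below is only an illustration under stated assumptions, not part of the argument above: it replaces the Laplace equation in a square by the usual five-point difference equation (each interior value equals the average of its four neighbours), and the names `solve_laplace`, `sample_boundary`, and the grid size are hypothetical choices made for the example.

```python
def solve_laplace(boundary, n=20, sweeps=1500):
    """Relax the five-point discrete Laplace equation on an (n+1) x (n+1) grid.

    boundary(i, j) gives the prescribed values on the edge of the square.
    Interior values start at 0 and are repeatedly replaced by the average
    of their four neighbours (Gauss-Seidel sweeps).
    """
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i, j)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1])
    return u


def extremes(u):
    """Return (boundary min, boundary max, interior min, interior max)."""
    n = len(u) - 1
    edge = [u[i][j] for i in range(n + 1) for j in range(n + 1)
            if i in (0, n) or j in (0, n)]
    inner = [u[i][j] for i in range(1, n) for j in range(1, n)]
    return min(edge), max(edge), min(inner), max(inner)


def sample_boundary(i, j):
    # Arbitrary non-constant boundary data (values of a bilinear function).
    return (i - 7) * (j - 3) / 10.0


u = solve_laplace(sample_boundary)
edge_min, edge_max, inner_min, inner_max = extremes(u)
print(edge_min, inner_min, inner_max, edge_max)
```

In this discrete model the interior extremes lie between the boundary extremes, which mirrors the statement proved above for harmonic functions.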

This theorem has an important corollary. Two harmonic functions that
agree on the boundary of a domain must agree everywhere inside the
domain. For then the difference of these functions (which itself will be a
harmonic function) vanishes on the boundary of the domain and thus is
everywhere equal to zero in the interior of the domain.

So we see that the values of a harmonic function on the boundary
completely determine the function. It may be shown (although we cannot
give the details here) that for arbitrarily preassigned values on the
boundary one can always find a harmonic function that assumes these
values.

It is somewhat more complicated to prove that the steady-state
temperature established in a body is completely determined, if we know
the rate of flow of heat through each element of the surface of the body
or a law connecting the flow of heat with the temperature. We will return
to some aspects of this question when we discuss methods of solving
the problems of mathematical physics.


The boundary-value problem for the heat equation. A completely
different situation occurs in the problem of the heat equation in the
nonstationary case. It is physically clear that the values of the temperature on
the boundary or of the rate of the flow of heat through the boundary are
not sufficient in themselves to define a unique solution of the problem.
But if in addition we know the temperature distribution at some initial
instant of time, then the problem is uniquely determined. Thus to
determine the solution of the equation of heat conduction (8) it is usually
necessary and sufficient to assign one arbitrary function $T_0(x, y, z)$
describing the initial distribution of temperature and also one arbitrary
function on the boundary of the domain. As before, this may be either
the temperature on the surface of the body, or the rate of heat flow
through each element of the surface, or a law connecting the flow of
heat with the temperature.

In this manner, the problem may be stated as follows. We seek a solution
of equation (8) under the condition

$$T|_{t=0} = T_0(x, y, z) \tag{11}$$

and one of the three following conditions

$$T|_S = \varphi(Q), \tag{12}$$

$$\left.\frac{\partial T}{\partial n}\right|_S = \psi(Q), \tag{12'}$$

$$\alpha\left.\frac{\partial T}{\partial n}\right|_S + \beta T|_S = \chi(Q), \tag{12''}$$

where $Q$ is any point of the surface $S$.

Condition (11) is called an initial condition, while conditions (12) are
boundary conditions.

We will not prove in detail that every such problem has a unique
solution but will establish this fact only for the first of these problems;
moreover, we will consider only the case where there are no heat sources
in the interior of the medium. We show that the equation

$$\Delta T = \frac{1}{a^2}\frac{\partial T}{\partial t}$$

under the conditions

$$T|_{t=0} = T_0(x, y, z), \qquad T|_S = \varphi(Q)$$

can have only one solution.

The proof of this statement is very similar to the previous proof for the
uniqueness of the solution of the Laplace equation. We show first of all
that if

$$\Delta T - \frac{1}{a^2}\frac{\partial T}{\partial t} < 0,$$

then the function $T$, as a function of four variables, $x$, $y$, $z$, and
$t$ ($0 \le t \le t_0$), assumes its minimum either on the boundary of the domain
$\Omega$ or else inside $\Omega$, but in the latter case necessarily at the initial instant
of time, $t = 0$.

For if not, then the minimum would be attained at some interior point.
At this point all the first derivatives, including $\partial T/\partial t$, will then be equal
to zero, and if this minimum were to occur for $t = t_0$, then $\partial T/\partial t$ would
be nonpositive. Also, at this point all second derivatives with respect to
the variables $x$, $y$, and $z$ will be nonnegative. Consequently
$\Delta T - (1/a^2)(\partial T/\partial t)$ will be nonnegative, which in our case is impossible.

In exactly the same way we can establish that if
$\Delta T - (1/a^2)(\partial T/\partial t) > 0$, then inside $\Omega$ for $0 < t \le t_0$ there cannot
exist a maximum for the function $T$.

Finally, if $\Delta T - (1/a^2)(\partial T/\partial t) = 0$, then inside $\Omega$ for $0 < t \le t_0$ the
function $T$ cannot attain its absolute maximum nor its absolute minimum,
since if the function $T$ were to have, for example, such an absolute
minimum, then by adding to it the term $\eta(t - t_0)$ and considering the function
$T_1 = T + \eta(t - t_0)$, we would not destroy the absolute minimum if
$\eta > 0$ were sufficiently small, and then $\Delta T_1 - (1/a^2)(\partial T_1/\partial t) = -\eta/a^2$
would be negative, which is impossible.

In the same way we can also show the absence of an absolute maximum
for $T$ in the domain under consideration.

However, an absolute maximum, as well as an absolute minimum of
temperature may occur either at the initial instant $t = 0$ or on the
boundary $S$ of the medium. If $T = 0$ both at the initial instant and on the
boundary, then we have the identity $T = 0$ throughout the interior of the
domain for all $t \le t_0$. If any two temperature distributions $T_1$ and $T_2$
have identical values for $t = 0$ and on the boundary, then their difference
$T_1 - T_2 = T$ will satisfy the heat equation and will vanish for $t = 0$
and on the boundary. This means that $T_1 - T_2$ will be everywhere equal
to zero, so that the two temperature distributions $T_1$ and $T_2$ will be
everywhere identical.
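
The same property can be watched in a simple computation. The following is a minimal sketch under assumptions not made in the text: one space variable instead of three, an explicit finite-difference scheme, and a hypothetical step ratio `r` (the quantity $a^2\,\Delta t/\Delta x^2$) chosen no larger than 0.5 so that each new value is a weighted average of old ones. The function names are illustrative only.

```python
def heat_step(T, r):
    """One explicit time step for T_t = a^2 T_xx with fixed end values.

    r stands for a^2 * dt / dx^2 and is assumed to satisfy r <= 0.5.
    """
    new = T[:]
    for i in range(1, len(T) - 1):
        new[i] = T[i] + r * (T[i - 1] - 2.0 * T[i] + T[i + 1])
    return new


def evolve(T0, r=0.4, steps=200):
    """Return the list of temperature profiles at successive time steps."""
    history = [T0[:]]
    T = T0[:]
    for _ in range(steps):
        T = heat_step(T, r)
        history.append(T)
    return history


# A single hot spot in the middle; both ends are held at temperature 0.
initial = [0.0] * 21
initial[10] = 1.0
history = evolve(initial)
peaks = [max(profile) for profile in history]
print(peaks[0], peaks[1], peaks[-1])
```

In this model the largest temperature never exceeds the largest initial or boundary value and the smallest never falls below the smallest one, in line with the statement proved above.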

In the investigation given later of methods of solving the equations of
mathematical physics we will see that the value of $T$ for $t = 0$ and the
right side of one of the equations (12) may be given arbitrarily, i.e., that
the solution of such a problem will exist.

The energy of oscillations and the boundary-value problem for the
equation of oscillation. We now consider the conditions under which the
third of the basic differential equations has a unique solution, namely
equation (9).

For simplicity we will consider the equation for the vibrating string
$\partial^2 u/\partial x^2 = (1/a^2)(\partial^2 u/\partial t^2)$, which is very similar to equation (9), differing
from it only in the number of space variables. On the right side of this
equation there is the quantity $\partial^2 u/\partial t^2$ expressing the acceleration of an
arbitrary point of the string. The motion of any mechanical system for
which the forces, and consequently the accelerations, are expressed by
the coordinates of the moving bodies, is completely determined if we are
given the initial positions and velocities of all the points of the system.
Thus for the equation of the vibrating string, it is natural to assign the
positions and velocities of all points at the initial instant:

$$u|_{t=0} = u_0(x), \qquad \left.\frac{\partial u}{\partial t}\right|_{t=0} = u_1(x).$$


But as was pointed out earlier, at the ends of the string the formulas
expressing the laws of mechanics for interior points cease to apply. Thus
at both ends we must assign supplementary conditions. If, for example,
the string is fixed in a position of equilibrium at both ends, then we will
have

$$u|_{x=0} = u|_{x=l} = 0.$$

These conditions can sometimes be replaced by more general ones, but a
change of this sort is not of basic importance.

The problem of finding the necessary solutions of equation (9) is
analogous. In order that such a solution be well defined, it is customary to
assign the conditions

$$p|_{t=0} = f_0(x, y, z), \qquad \left.\frac{\partial p}{\partial t}\right|_{t=0} = f_1(x, y, z) \tag{13}$$

and also one of the "boundary conditions"

$$p|_S = \varphi(Q), \tag{14}$$

$$\left.\frac{\partial p}{\partial n}\right|_S = \psi(Q), \tag{14'}$$

$$\alpha\left.\frac{\partial p}{\partial n}\right|_S + \beta p|_S = \chi(Q).^* \tag{14''}$$

* If the right-hand sides in conditions (13) and (14) are equal to zero, such conditions
are called "homogeneous."

The difference from the preceding case is simply that instead of the one
initial condition in equation (11) we have the two conditions (13).

Equations (14) obviously express the physical laws for the particles on
the boundary of the volume in question.

The proof that in the general case the conditions (13) together with an
arbitrary one of the conditions (14) uniquely define a solution of the
problem will be omitted. We will show only that the solution is
unique for one of the conditions in (14).

Let it be known that a function $u$ satisfies the equation

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{a^2}\frac{\partial^2 u}{\partial t^2},$$

with initial conditions

$$u|_{t=0} = 0, \qquad \left.\frac{\partial u}{\partial t}\right|_{t=0} = 0$$

and boundary condition

$$\left.\frac{\partial u}{\partial n}\right|_S = 0.$$

(It would be just as easy to discuss the case in which $u|_S = 0$.)

We will show that under these conditions the function $u$ must be
identically zero.

To prove this property it will not be sufficient to use the arguments
introduced earlier to establish the uniqueness of the solution of the first
two problems. But here we may make use of the physical interpretation.

We will need just one physical law, the "law of conservation of energy."
We restrict ourselves again for simplicity to the vibrating string, the
displacement of whose points $u(x, t)$ satisfies the equation

$$T\frac{\partial^2 u}{\partial x^2} = \rho\frac{\partial^2 u}{\partial t^2}.$$

The kinetic energy of each particle of the string oscillating from $x$ to $x + dx$
is expressed in the form

$$\frac{1}{2}\rho\left(\frac{\partial u}{\partial t}\right)^2 dx.$$

Along with its kinetic energy, the string in its displaced position also
possesses potential energy created by its increase of length in comparison
with the straight-line position. Let us compute this potential energy. We
concern ourselves with an element of the string between the points $x$ and
$x + dx$. This element has an inclined position with respect to the axis $Ox$,
such that its length is approximately equal to

$$\sqrt{(dx)^2 + \left(\frac{\partial u}{\partial x}\,dx\right)^2},$$

so its elongation is

$$\sqrt{(dx)^2 + \left(\frac{\partial u}{\partial x}\,dx\right)^2} - dx \approx \frac{1}{2}\left(\frac{\partial u}{\partial x}\right)^2 dx.$$

Multiplying this elongation by the tension $T$, we find the potential energy
of the elongated element of the string

$$\frac{1}{2}T\left(\frac{\partial u}{\partial x}\right)^2 dx.$$

The total energy of the string of length $l$ is obtained by summing the
kinetic and potential energies over all of the points of the string. We get

$$E = \frac{1}{2}\int_0^l \left[T\left(\frac{\partial u}{\partial x}\right)^2 + \rho\left(\frac{\partial u}{\partial t}\right)^2\right] dx.$$

If the forces acting on the ends of the string do no work, in particular if
the ends of the string are fixed, then the total energy of the string must be
constant:

$$E = \text{const}.$$

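Our expression for the law of conservation of energy is a mathematical
corollary of the basic equations of mechanics and may be derived from
them. Since we have already written the laws of motion in the form of
the differential equation of the vibrating string with conditions on the
ends, we can give the following mathematical proof of the law of
conservation of energy in this case. If we differentiate $E$ with respect to time, we
have, from basic general rules,

$$\frac{dE}{dt} = \int_0^l \left(T\frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial x\,\partial t} + \rho\frac{\partial u}{\partial t}\frac{\partial^2 u}{\partial t^2}\right) dx.$$

Using the wave equation (6) and replacing $\rho(\partial^2 u/\partial t^2)$ by $T(\partial^2 u/\partial x^2)$, we
get $dE/dt$ in the form

$$\frac{dE}{dt} = \int_0^l T\left(\frac{\partial u}{\partial x}\frac{\partial^2 u}{\partial x\,\partial t} + \frac{\partial u}{\partial t}\frac{\partial^2 u}{\partial x^2}\right) dx = \int_0^l T\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\frac{\partial u}{\partial t}\right) dx = \left. T\frac{\partial u}{\partial x}\frac{\partial u}{\partial t}\right|_{x=0}^{x=l}.$$

If $(\partial u/\partial x)|_{x=0}$ or $u|_{x=0}$ vanishes, and also $(\partial u/\partial x)|_{x=l}$ or $u|_{x=l}$ vanishes
(when $u$ vanishes at an end for all $t$, so does $\partial u/\partial t$ there), then

$$\frac{dE}{dt} = 0,$$

which shows that $E$ is constant.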
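
As a numerical illustration of this conservation law, the sketch below evaluates the energy integral for one particular motion of a string with fixed ends, the standing wave $u = \sin(\pi x/l)\cos(\pi a t/l)$ with $a^2 = T/\rho$. The choice of this solution, the numerical values of the length, tension, and density, and the midpoint-rule quadrature are all assumptions made for the example.

```python
import math


def string_energy(t, length=1.0, tension=4.0, density=1.0, n=2000):
    """Energy E(t) = 1/2 * integral over [0, l] of (T u_x^2 + rho u_t^2) dx.

    Evaluated with the midpoint rule for the standing wave
    u(x, t) = sin(pi x / l) * cos(pi a t / l), where a^2 = T / rho.
    """
    a = math.sqrt(tension / density)
    k = math.pi / length
    omega = k * a
    dx = length / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        u_x = k * math.cos(k * x) * math.cos(omega * t)
        u_t = -omega * math.sin(k * x) * math.sin(omega * t)
        total += 0.5 * (tension * u_x ** 2 + density * u_t ** 2) * dx
    return total


energies = [string_energy(t) for t in (0.0, 0.1, 0.37, 1.0)]
print(energies)
```

For this solution the integral can also be evaluated by hand and equals $T\pi^2/(4l)$ at every instant, so the printed values should agree with one another up to rounding.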

The wave equation (9) may be treated in exactly the same way to prove
that the law of conservation of energy holds here also. If $p$ satisfies equation
(9) and the condition

$$p|_S = 0 \quad \text{or} \quad \left.\frac{\partial p}{\partial n}\right|_S = 0,$$

then the quantity

$$E = \iiint_\Omega \left[\left(\frac{\partial p}{\partial x}\right)^2 + \left(\frac{\partial p}{\partial y}\right)^2 + \left(\frac{\partial p}{\partial z}\right)^2 + \frac{1}{a^2}\left(\frac{\partial p}{\partial t}\right)^2\right] dx\,dy\,dz$$

will not depend on $t$.

If, at the initial instant of time, the total energy of the oscillations is
equal to zero, then it will always remain equal to zero, and this is possible
only in the case that no motion occurs. If the problem of integrating the
wave equation with initial and boundary conditions had two solutions
$p_1$ and $p_2$, then $v = p_1 - p_2$ would be a solution of the wave equation
satisfying the conditions with zero on the right-hand side, i.e.,
homogeneous conditions.

In this case, when we calculated the "energy" of such an oscillation,
described by the function $v$, we would discover that the energy $E(v)$ is
equal to zero at the initial instant of time. This means that it is always
equal to zero and thus that the function $v$ is identically equal to zero, so
that the two solutions $p_1$ and $p_2$ are identical. Thus the solution of the
problem is unique.

In this way we have convinced ourselves that all three problems are
correctly posed.

Incidentally, we have been able to discover some very simple properties
of the solutions of these equations. For example, solutions of the Laplace
equation have the following maximum property: Functions satisfying this
equation have their largest and smallest values on the boundaries of their
domains of definition.

Functions describing the distribution of heat in a medium have a
maximum property of a different form. Every maximum or minimum of
temperature occurring at any point gradually disperses and decreases with
time. The temperature at any point can rise or fall only if it is lower or
higher than at nearby points. The temperature is smoothed out with the
passage of time. All unevennesses in it are leveled out by the passage of
heat from hot places to cold ones.

But no smoothing-out process of this kind occurs in the propagation
of the oscillations considered here. These oscillations do not decrease or
level out, since the sum of their kinetic and potential energies must remain
constant for all time.


§4. The Propagation of Waves 

The properties of oscillations can be very clearly demonstrated by the 
simplest examples. Let us consider two characteristic cases. 

Our first example is the equation of the vibrating string

$$\frac{\partial^2 u}{\partial x^2} = \frac{1}{a^2}\frac{\partial^2 u}{\partial t^2}. \tag{15}$$

This equation, as may be proved, has two particular solutions of the
form

$$u_1 = \varphi_1(x - at), \qquad u_2 = \varphi_2(x + at),$$

where $\varphi_1$ and $\varphi_2$ are arbitrary twice-differentiable functions.

By direct differentiation it is easy to show that the functions $u_1$ and
$u_2$ satisfy equation (15). It may be shown that

$$u = u_1 + u_2$$

is a general solution of this equation.
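
The direct differentiation mentioned above can be imitated numerically. The following sketch is an illustration only: the wave speed `A` and the two profile functions are hypothetical choices, and the second derivatives are approximated by central differences, so the residual of equation (15) is expected to be small rather than exactly zero.

```python
import math

A = 2.0  # the constant a of equation (15); an arbitrary illustrative value


def phi1(s):
    return math.exp(-s * s)  # any twice-differentiable profile would do


def phi2(s):
    return math.sin(s)


def u(x, t):
    """Sum of a wave travelling to the right and a wave travelling to the left."""
    return phi1(x - A * t) + phi2(x + A * t)


def residual(x, t, h=1e-3):
    """Approximate u_xx - (1/a^2) u_tt at (x, t) by central differences."""
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    return u_xx - u_tt / A ** 2


for point in [(0.0, 0.0), (0.3, 0.2), (-1.0, 0.7)]:
    print(point, residual(*point))
```

The residuals come out at the level of the discretization error, which is what one expects if $u_1 + u_2$ satisfies equation (15).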

The general form of the oscillations described by the functions $u_1$ and $u_2$
is of considerable interest. To consider it in the most convenient fashion,
we mentally carry out the following experiment. Let the observer of the
vibrating string be himself not stationary but moving along the axis $Ox$
with velocity $a$. For such an observer the position of a point on the string
will be defined not by a stationary coordinate system but by a moving
one. Let $\xi$ denote the $x$-coordinate of this system. Then $\xi = 0$ will
obviously correspond at each instant of time to the value $x = at$. Hence
it is clear that

$$\xi = x - at.$$

We can represent an arbitrary function $u(x, t)$ in the form

$$u(x, t) = \varphi(\xi, t).$$

For the solution $u_1$ we will have

$$u_1(x, t) = \varphi_1(\xi),$$




so that in this coordinate system the solution $u_1(x, t)$ turns out to be
independent of time. Consequently, for an observer moving with velocity
$a$, the string looks like a stationary curve. For a stationary observer,
however, the string appears to have a wave flowing along the axis $Ox$ with
velocity $a$.

In exactly the same way the solution $u_2(x, t)$ may be considered as a
wave travelling in the opposite direction with velocity $a$. With an infinite
string both waves will be propagated infinitely far. Moving in different
directions they may, by their superposition, produce quite strange shapes
in the string. The resultant displacement may be increasing at certain
times and decreasing at others.




Fig. 3. 


If $u_1$ and $u_2$, as they arrive at a given point from opposite sides, have
the same sign, then they augment each other, but if they have opposite
signs, they counteract each other. Figure 3 shows several successive
positions of the string for two particular displacements. Initially
the waves move independently toward each other, and then begin
to interact. In the second case in figure 3 there will be an instant of
complete annihilation of the oscillations, after which the waves again
separate.

Another example that easily lends itself to qualitative investigation is
the propagation of waves in space.

The equation

$$\Delta u = \frac{1}{a^2}\frac{\partial^2 u}{\partial t^2}, \tag{16}$$

derived earlier, has two particular solutions of the form

$$u_1 = \frac{1}{r}\varphi_1(r - at), \qquad u_2 = \frac{1}{r}\varphi_2(r + at), \tag{17}$$

where $r$ denotes the distance of a given point from the origin of the
coordinate system, $r^2 = x^2 + y^2 + z^2$, and $\varphi_1$ and $\varphi_2$ are arbitrary,
twice-differentiable functions.

The proof that $u_1$ and $u_2$ are solutions would take considerable time and
is omitted here.
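
For readers who wish to check the claim, the verification becomes short if one is willing to assume a standard fact that is not derived in this text: for a function depending on the space variables only through $r$, the Laplacian reduces to $\partial^2 u/\partial r^2 + (2/r)\,\partial u/\partial r$. Under that assumption, a sketch for $u_1$ (the computation for $u_2$ is the same with $r + at$ in place of $r - at$) runs as follows.

```latex
\begin{aligned}
u_1 &= \frac{1}{r}\,\varphi_1(r - at), \\
\frac{\partial u_1}{\partial r} &= \frac{\varphi_1'}{r} - \frac{\varphi_1}{r^2}, \qquad
\frac{\partial^2 u_1}{\partial r^2} = \frac{\varphi_1''}{r} - \frac{2\varphi_1'}{r^2} + \frac{2\varphi_1}{r^3}, \\
\Delta u_1 &= \frac{\partial^2 u_1}{\partial r^2} + \frac{2}{r}\frac{\partial u_1}{\partial r}
            = \frac{\varphi_1''}{r}, \\
\frac{1}{a^2}\frac{\partial^2 u_1}{\partial t^2} &= \frac{1}{a^2}\cdot\frac{a^2\varphi_1''}{r}
            = \frac{\varphi_1''}{r} = \Delta u_1 .
\end{aligned}
```

Here $\varphi_1'$ and $\varphi_1''$ denote derivatives of $\varphi_1$ with respect to its single argument, evaluated at $r - at$, and the computation is valid for $r > 0$.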

The form of the waves described by these solutions is in general the
same as for the string. If we pay no attention to the factor $1/r$ occurring
on the right, then the first solution represents a wave travelling in the
direction of increasing $r$. This wave is spherically symmetric; it is identical
at all points that have the same value of $r$.

The factor $1/r$ produces the result that the amplitude of the wave is
inversely proportional to the distance from the origin. Such an oscillation
is called a diverging spherical wave. A good picture of it is given by the
circles that spread out over the surface of the water when a stone is
thrown into it, except that in this case the waves are circular rather than
spherical.

The second solution of (17) is also of great interest; it is called a
converging wave, travelling in the direction of the origin. Its amplitude
grows with time to infinity as it approaches the origin. We see that such a
concentration of the disturbance at one point may lead, even though the
initial oscillations are small, to an immense upheaval.


§5. Methods of Constructing Solutions 

On the possibility of decomposing any solution into simpler solutions.

Solutions of the problems of mathematical physics formulated previously
may be derived by various devices, which are different for different specific
problems. But at the basis of these methods there is one general idea. As we have
seen, all the equations of mathematical physics are, for small values of
the unknown functions, linear with respect to the functions and their
derivatives. The boundary conditions and initial conditions are also
linear.

If we form the difference between any two solutions of the same
equation, this difference will also be a solution of the equation with the
right-hand terms equal to zero. Such an equation is called the
corresponding homogeneous equation. For example, for the Poisson equation
$\Delta u = -4\pi\rho$, the corresponding homogeneous equation is the Laplace
equation $\Delta u = 0$.

If two solutions of the same equation also satisfy the same boundary
conditions, then their difference will satisfy the corresponding
homogeneous condition: The values of the corresponding expression on the
boundary will be equal to zero.

Hence the entire manifold of the solutions of such an equation, for
given boundary conditions, may be found by taking any particular solution
that satisfies the given nonhomogeneous condition together with all
possible solutions of the homogeneous equation satisfying homogeneous
boundary conditions (but not, in general, satisfying the initial conditions).

Solutions of homogeneous equations satisfying homogeneous boundary
conditions may be added, or multiplied by constants, without ceasing to
be solutions.

If a solution of a homogeneous equation with homogeneous conditions
is a function of some parameter, then integrating with respect to this
parameter will also give us such a solution. These facts form the basis of
the most important method of solving linear problems of all kinds for the
equations of mathematical physics, the method of superposition.

The solution of the problem is sought in the form

$$u = u_0 + \sum_k u_k,$$

where $u_0$ is a particular solution of the equation satisfying the boundary
conditions but not satisfying the initial conditions, and the $u_k$ are solutions
of the corresponding homogeneous equation satisfying the corresponding
homogeneous boundary conditions. If the equation and the boundary
conditions were originally homogeneous, then the solution of the problem
may be sought in the form

$$u = \sum_k u_k.$$

In order to be able to satisfy arbitrary initial conditions by the choice of
particular solutions $u_k$ of the homogeneous equation, we must have
available a sufficiently large arsenal of such solutions.

The method of separation of variables. For the construction of the
necessary arsenal of solutions there exists a method called separation of
variables or Fourier's method.

Let us examine this method, for example, for solving the problem

$$\Delta u = \frac{\partial^2 u}{\partial t^2}, \tag{18}$$

$$u|_S = 0, \qquad u|_{t=0} = f_0(x, y, z), \qquad u_t|_{t=0} = f_1(x, y, z).$$

In looking for any particular solution of the equation, we first of all
assume that the desired function $u$ satisfies the boundary condition $u|_S = 0$
and can be expressed as the product of two functions, one of which depends
only on the time $t$ and the other only on the space variables:

$$u(x, y, z, t) = U(x, y, z)\,T(t).$$

Substituting this assumed solution into our equation, we have

$$T(t)\,\Delta U = T''(t)\,U.$$

Dividing both sides by $TU$ gives

$$\frac{T''}{T} = \frac{\Delta U}{U}.$$

The right side of this equation is a function of the space variables only
and the left is independent of the space coordinates. Hence it follows that
the given equation can be true only if the left and right sides have the
same constant value. We are led to a system of two equations

$$\frac{T''}{T} = -\lambda_k^2, \qquad \frac{\Delta U}{U} = -\lambda_k^2.$$

The constant quantity on the right is denoted here by $-\lambda_k^2$ in order to
emphasize that it is negative (as may be rigorously proved). The subscript
$k$ is used here to note that there exist infinitely many possible values of
$-\lambda_k^2$, where the solutions corresponding to them form a system of
functions complete in a well-known sense.

Cross-multiplying in both equations, we get

$$T'' + \lambda_k^2 T = 0; \qquad \Delta U + \lambda_k^2 U = 0.$$

The first of these equations has, as we know, the simple solution

$$T = A_k \cos\lambda_k t + B_k \sin\lambda_k t,$$

where $A_k$ and $B_k$ are arbitrary constants. This solution may be further
simplified by introducing the auxiliary angle $\varphi_k$. We have

$$\frac{A_k}{\sqrt{A_k^2 + B_k^2}} = \sin\varphi_k, \qquad \frac{B_k}{\sqrt{A_k^2 + B_k^2}} = \cos\varphi_k, \qquad \sqrt{A_k^2 + B_k^2} = M_k.$$

Then

$$T = \sqrt{A_k^2 + B_k^2}\,\sin(\lambda_k t + \varphi_k) = M_k \sin(\lambda_k t + \varphi_k).$$

The function $T$ represents a harmonic oscillation with frequency $\lambda_k$,
shifted in phase by the angle $\varphi_k$.
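
The passage from the constants $A_k$, $B_k$ to the amplitude $M_k$ and phase $\varphi_k$ can be sketched in a few lines. The helper name `amplitude_phase` and the sample values are hypothetical; the only step taken beyond the formulas above is the use of the two-argument arctangent to pick the angle whose sine is $A/M$ and whose cosine is $B/M$.

```python
import math


def amplitude_phase(A, B):
    """Return (M, phi) such that A*cos(w*t) + B*sin(w*t) = M*sin(w*t + phi)."""
    M = math.hypot(A, B)       # M = sqrt(A^2 + B^2)
    phi = math.atan2(A, B)     # sin(phi) = A / M, cos(phi) = B / M
    return M, phi


A_k, B_k, lam = 3.0, 4.0, 2.5   # illustrative values only
M_k, phi_k = amplitude_phase(A_k, B_k)

for t in (0.0, 0.4, 1.3):
    left = A_k * math.cos(lam * t) + B_k * math.sin(lam * t)
    right = M_k * math.sin(lam * t + phi_k)
    print(t, left, right)
```

The two columns printed for each instant should coincide up to rounding, which is the content of the identity used in the text.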

More difficult and more interesting is the problem of finding a solution
of the equation

$$\Delta U + \lambda_k^2 U = 0 \tag{19}$$

for given homogeneous boundary conditions; for example, for the
conditions

$$U|_S = 0$$

(where $S$ is the boundary of the volume $\Omega$ under consideration), or for
any other homogeneous condition. The solution of this problem is not
always easy to construct as a finite combination of known functions,
although it always exists and can be found to any desired degree of
accuracy.

The equation $\Delta U + \lambda_k^2 U = 0$ for the condition $U|_S = 0$ has first of
all the obvious solution $U = 0$. This solution is trivial and completely
useless for our purposes. If the $\lambda_k$ are any randomly chosen numbers,
then in general there will not be any other solution to our problem.
However, there usually exist values of $\lambda_k$ for which the equation does have
a nontrivial solution.

All possible values of the constant $\lambda_k^2$ are determined by the requirement
that equation (19) have a nontrivial solution, i.e., distinct from the
identically vanishing function, which satisfies the condition $U|_S = 0$.
From this it also follows that the numbers denoted by $-\lambda_k^2$ must be
negative.

For each of the possible values of $\lambda_k$ in equation (19), we can find at
least one function $U_k$. This allows us to construct a particular solution
of the wave equation (18) in the form

$$u_k = M_k \sin(\lambda_k t + \varphi_k)\,U_k(x, y, z).$$

Such a solution is called a characteristic oscillation (or eigenvibration) of
the volume under consideration. The constant $\lambda_k$ is the frequency of the
characteristic oscillation, and the function $U_k(x, y, z)$ gives us its form.
This function is usually called an eigenfunction (characteristic function).
For all instants of time, the function $u_k$, considered as a function of the
variables $x$, $y$, and $z$, will differ from the function $U_k(x, y, z)$ only in
scale.

We do not have space here for a detailed proof of the many remarkable
properties of characteristic oscillations and of eigenfunctions; therefore
we will restrict ourselves merely to listing some of them.

The first property of the characteristic oscillations consists of the fact
that for any given volume there exists a countable set of characteristic
frequencies. These frequencies tend to infinity with increasing $k$.

Another property of the characteristic oscillations is called orthogonality.
It consists of the fact that the integral over the domain $\Omega$ of the product
of eigenfunctions corresponding to different values of $\lambda_k$ is equal to zero.*

$$\iiint_\Omega U_j(x, y, z)\,U_k(x, y, z)\,dx\,dy\,dz = 0 \quad (j \ne k).$$

For $j = k$ we will assume

$$\iiint_\Omega U_k(x, y, z)^2\,dx\,dy\,dz = 1.$$

This can always be arranged by multiplying the functions $U_k(x, y, z)$ by
an appropriate constant, the choice of which does not change the fact
that the function satisfies equation (19) and the condition $U|_S = 0$.

Finally, a third property of the characteristic oscillations consists of the
fact that, if we do not omit any value of $\lambda_k$, then by means of the
eigenfunctions $U_k(x, y, z)$, we can represent with any desired degree of exactness
a completely arbitrary function $f(x, y, z)$, provided only that it satisfies
the boundary condition $f|_S = 0$ and has continuous first and second
derivatives. Any such function $f(x, y, z)$ may be represented by the
convergent series

$$f(x, y, z) = \sum_{k=1}^{\infty} C_k U_k(x, y, z). \tag{20}$$


The third property of the eigenfunctions provides us in principle with 
the possibility of representing any function J\x, y, z) in a sériés of eigen¬ 
functions of our problem, and from the second property we can find ail 


* If to one and the same value of A there correspond several essentially different 
(linearly independent) functions U, then this value of A is considered as occurring a 
corresponding number of times in the set of eigenvalues A, . The condition of ortho¬ 
gonality for functions corresponding to the same value of A, may be ensured by proper 
choice of these functions. 
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the coefficients of this sériés. In fact, if we multiply both sides of équation 
(20) by U s (x, y, z) and integrate over the domain Q, we get 


y, z) U,(x, y, z) dx dy dz 
a 


= X C * f f f U^x, y, z ) U^x, y, z) dx dy dz. 


In the sum on the right, ail the terms in which k =£ j disappear because 
of the orthogonality, and the coefficient of C, is equal to one. Consequently 
we hâve 

Ci = y . 2) U,(X, y, z ) dx dy dz. 

a 

These properties of the characteristic oscillations now allow us to solve
the general problem of oscillation for any initial conditions.

For this we assume that we have a solution of the problem in the form

$$u = \sum_{k=1}^{\infty} U_k(x, y, z)\,(A_k \cos \lambda_k t + B_k \sin \lambda_k t) \tag{21}$$

and try to choose the constants $A_k$ and $B_k$ so that we have

$$u|_{t=0} = f_0(x, y, z), \qquad
  \frac{\partial u}{\partial t}\Big|_{t=0} = f_1(x, y, z).$$

Putting t = 0 in the right side of (21), we see that the sine terms disappear
and $\cos \lambda_k t$ becomes equal to one, so that we will have

$$f_0(x, y, z) = \sum_{k=1}^{\infty} A_k U_k(x, y, z).$$

From the third property, the characteristic oscillations can be used for
such a representation, and from the second property, we have

$$A_k = \iiint_\Omega f_0(x, y, z)\, U_k(x, y, z)\, dx\, dy\, dz.$$

In the same way, differentiating formula (21) with respect to t and putting
t = 0, we will have

$$\frac{\partial u}{\partial t}\Big|_{t=0} = f_1(x, y, z)
  = \sum_{k=1}^{\infty} \lambda_k (B_k \cos \lambda_k t - A_k \sin \lambda_k t)\big|_{t=0}\, U_k(x, y, z)
  = \sum_{k=1}^{\infty} \lambda_k B_k U_k(x, y, z).$$

Hence, as before, we obtain the values of $B_k$ as

$$B_k = \frac{1}{\lambda_k} \iiint_\Omega f_1(x, y, z)\, U_k(x, y, z)\, dx\, dy\, dz.$$

Knowing $A_k$ and $B_k$, we in fact know both the phases and the amplitudes
of all the characteristic oscillations.

In this way we have shown that by addition of characteristic oscillations
it is possible to obtain the most general solution of the problem with
homogeneous boundary conditions.

Every solution thus consists of characteristic oscillations, whose
amplitude and phase we can calculate if we know the initial conditions.

In exactly the same way, we may study oscillations with a smaller
number of independent variables. As an example let us consider the
vibrating string, fixed at both ends. The equation of the vibrating string
has the form

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2}.$$

Let us suppose that we are looking for a solution of the problem for a
string of length l, fixed at the ends:

$$u|_{x=0} = u|_{x=l} = 0.$$

We will look for a collection of particular solutions

$$u_k = T_k(t)\, U_k(x).$$

We obviously obtain, just as before,

$$T_k'' U_k = a^2 U_k'' T_k,$$

or

$$\frac{T_k''}{T_k} = a^2 \frac{U_k''}{U_k} = -\lambda_k^2.$$

Hence

$$T_k = A_k \cos \lambda_k t + B_k \sin \lambda_k t, \qquad
  U_k = M_k \cos \frac{\lambda_k}{a} x + N_k \sin \frac{\lambda_k}{a} x.$$

We use the boundary conditions in order to find the values of $\lambda_k$. For
general $\lambda_k$ it is not possible to satisfy both the boundary conditions. From
the condition $U_k|_{x=0} = 0$ we get $M_k = 0$, and this means that
$U_k = N_k \sin(\lambda_k x / a)$. Putting x = l, we get $\sin(\lambda_k l / a) = 0$.
This can only happen if $\lambda_k l / a = k\pi$, where k is an integer. This means
that

$$\lambda_k = \frac{a k \pi}{l}.$$

The condition $\int_0^l U_k^2\, dx = 1$ shows that $N_k = \sqrt{2/l}$. Finally

$$U_k(x) = \sqrt{\frac{2}{l}} \sin \frac{k \pi x}{l}, \qquad
  T_k = A_k \cos \frac{a k \pi t}{l} + B_k \sin \frac{a k \pi t}{l}.$$

In this manner the characteristic oscillations of the string, as we see,
have sinusoidal form with an integral number of half waves on the entire
string. Every oscillation has its own frequency, and the frequencies may
be arranged in increasing order

$$\frac{a\pi}{l},\ \frac{2a\pi}{l},\ \frac{3a\pi}{l},\ \dots,\ \frac{k a\pi}{l},\ \dots$$

It is well known that these frequencies are exactly those that we hear in
the vibrations of a sounding string. The frequency $a\pi/l$ is called the
fundamental frequency, and the remaining frequencies are overtones. The
eigenfunctions $\sqrt{2/l}\, \sin(k\pi x/l)$ on the interval $0 \le x \le l$ change
sign k - 1 times, since $k\pi x/l$ runs through values from 0 to $k\pi$, which
means that its sine changes sign k - 1 times. The points where the
eigenfunctions $U_k$ vanish are called nodes of the oscillations.
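The properties just listed can be checked numerically for the string. The
following Python sketch is only an illustration under stated assumptions: the
unit length, the grid, the function names, and the triangular initial shape
are choices made here and do not come from the text. It verifies that the
functions $U_k$ are orthonormal, counts their sign changes, and sums the first
terms of the series (20) for the chosen initial shape.

```python
import numpy as np

LENGTH = 1.0  # assumed string length l (illustrative value)


def eigenfunction(k, x, length=LENGTH):
    """U_k(x) = sqrt(2/l) * sin(k*pi*x/l), the k-th mode of the fixed string."""
    return np.sqrt(2.0 / length) * np.sin(k * np.pi * x / length)


def integrate(values, x):
    """Trapezoidal rule on the grid x."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(x)) / 2.0)


def gram_matrix(n_modes, x):
    """Matrix of the integrals of U_j * U_k over [0, l]."""
    modes = [eigenfunction(k, x) for k in range(1, n_modes + 1)]
    return np.array([[integrate(uj * uk, x) for uk in modes] for uj in modes])


def sign_changes(k, samples=1000):
    """Count sign changes of U_k sampled at the midpoints of a uniform grid."""
    x = (np.arange(samples) + 0.5) * LENGTH / samples
    signs = np.sign(eigenfunction(k, x))
    return int(np.sum(signs[1:] != signs[:-1]))


def expand(f0, n_modes, x):
    """Partial sum of series (20), with C_k = integral of f0 * U_k."""
    total = np.zeros_like(x)
    for k in range(1, n_modes + 1):
        u_k = eigenfunction(k, x)
        total += integrate(f0 * u_k, x) * u_k
    return total


x = np.linspace(0.0, LENGTH, 2001)
gram = gram_matrix(4, x)
plucked = np.minimum(x, LENGTH - x)   # assumed triangular initial shape
approx = expand(plucked, 50, x)

print(np.round(gram, 6))
print([sign_changes(k) for k in range(1, 6)])
print(float(np.max(np.abs(approx - plucked))))
```

The Gram matrix should be close to the identity, and the number of sign
changes of $U_k$ should be k - 1. The triangular shape has a corner, so it does
not meet the smoothness assumed in the third property; the partial sums still
approach it, only more slowly.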

If we arrange in some way that the string does not move at a point
corresponding to a node, for example of the first overtone, then the
fundamental tone will be suppressed, and we will hear only the sound of the
first overtone, which is an octave higher. Such a device, called stopping,
is made use of on instruments played with a bow: the violin, viola, and
violoncello.

We have analyzed the method of separating variables as applied to
the problem of finding characteristic oscillations. But the method can be
applied much more widely, to problems of heat flow and to a whole series
of other problems.

For the equation of heat flow

$$\frac{\partial T}{\partial t} = \Delta T$$

with the condition

$$T|_S = 0$$

we will have, as before,

$$T = \sum_{k=1}^{\infty} F_k(t)\, U_k(x, y, z).$$

Here

$$\frac{F_k'(t)}{F_k(t)} = -\lambda_k^2, \qquad \Delta U_k + \lambda_k^2 U_k = 0.$$

The solution is obtained in the form

$$T = \sum_{k=1}^{\infty} A_k e^{-\lambda_k^2 t}\, U_k(x, y, z).$$

This method has also been used with great success to solve some
other equations. Consider, for example, the Laplace equation

$$\Delta u = 0$$

in the circle

$$x^2 + y^2 \le 1,$$

and assume that we have to construct a solution satisfying the condition

$$u|_{r=1} = f(\vartheta),$$

where r and $\vartheta$ denote the polar coordinates of a point in the plane.

The Laplace equation may be easily transformed into polar coordinates.
It then has the form

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r}
  + \frac{1}{r^2} \frac{\partial^2 u}{\partial \vartheta^2} = 0.$$

We want to find a solution of this equation in the form

$$u = \sum_{k=1}^{\infty} R_k(r)\, \Theta_k(\vartheta).$$

If we require that every term of the series individually satisfy the equation,
we have

$$\Big[ R_k''(r) + \frac{1}{r} R_k'(r) \Big] \Theta_k(\vartheta)
  + \frac{1}{r^2} \Theta_k''(\vartheta)\, R_k(r) = 0.$$

Dividing the equation by $R_k(r)\, \Theta_k(\vartheta) / r^2$, we get

$$\frac{r^2 \big[ R_k''(r) + \frac{1}{r} R_k'(r) \big]}{R_k(r)}
  = -\frac{\Theta_k''(\vartheta)}{\Theta_k(\vartheta)}.$$

Again setting

$$\frac{\Theta_k''(\vartheta)}{\Theta_k(\vartheta)} = -\lambda_k^2,$$

we have

$$r^2 R_k'' + r R_k' - \lambda_k^2 R_k = 0.$$

It is easy to see that the function $\Theta_k(\vartheta)$ must be a periodic function
of $\vartheta$ with period $2\pi$. Integrating the equation
$\Theta_k''(\vartheta) + \lambda_k^2 \Theta_k(\vartheta) = 0$, we get

$$\Theta_k = a_k \cos \lambda_k \vartheta + b_k \sin \lambda_k \vartheta.$$

This function will be periodic with the required period only if $\lambda_k$ is an
integer. Putting $\lambda_k = k$, we have

$$\Theta_k = a_k \cos k\vartheta + b_k \sin k\vartheta.$$

The equation for $R_k$ has a general solution of the form

$$R_k = A r^k + \frac{B}{r^k}.$$

Retaining only the term that is bounded for $r \to 0$, we get the general
solution of the Laplace equation in the form

$$u = a_0 + \sum_{k=1}^{\infty} (a_k \cos k\vartheta + b_k \sin k\vartheta)\, r^k.$$
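To make the last formula concrete, here is a minimal Python sketch that
evaluates the series inside the unit circle for given boundary values. The text
does not state how to choose $a_k$ and $b_k$; the sketch assumes the standard
Fourier coefficient formulas for $f(\vartheta)$, which follow from putting r = 1.
The function names, the number of terms, and the sample boundary function are
illustrative assumptions.

```python
import numpy as np


def disk_solution(f, n_terms=20, n_quad=256):
    """Return u(r, theta) = a_0 + sum_k (a_k cos k*theta + b_k sin k*theta) r**k.

    The coefficients are the Fourier coefficients of the boundary values f,
    computed with the rectangle rule on a uniform grid of angles.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, n_quad, endpoint=False)
    values = f(theta)
    ks = np.arange(1, n_terms + 1)
    a0 = values.mean()
    a = 2.0 * (values * np.cos(np.outer(ks, theta))).mean(axis=1)
    b = 2.0 * (values * np.sin(np.outer(ks, theta))).mean(axis=1)

    def u(r, th):
        terms = (a * np.cos(ks * th) + b * np.sin(ks * th)) * r ** ks
        return float(a0 + np.sum(terms))

    return u


def boundary(theta):
    """Assumed sample boundary values f(theta)."""
    return 1.0 + 2.0 * np.cos(theta) + 3.0 * np.sin(2.0 * theta)


u = disk_solution(boundary)
print(u(0.5, 0.3))   # value at an interior point
print(u(1.0, 1.1))   # on the boundary this should reproduce f(1.1)
```

For this boundary function the series stops after k = 2, so the result can be
compared with $1 + 2r\cos\vartheta + 3r^2 \sin 2\vartheta$ directly.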

This method may often be used to find nontrivial solutions of the
equation $\Delta U_k + \lambda_k^2 U_k = 0$ that satisfy homogeneous boundary
conditions. In case the problem can be reduced to problems of solving ordinary
differential equations, we say that it allows a complete separation of
variables. This complete separation of variables by the Fourier method
can be carried out, as was shown by the Soviet mathematician V. V.
Stepanov, only in certain special cases. The method of separation of
variables was known to mathematicians a long time ago. It was used
essentially by Euler, Bernoulli, and d'Alembert. Fourier used it
systematically for the solution of problems of mathematical physics,
particularly in heat conduction. However, as we have mentioned, this method is
often inapplicable; we must use other methods, which we will now discuss.

The method of potentials. The essential feature of this method is, as
before, the superposition of particular solutions for the construction of a
solution in general form. But this time for the particular fundamental
solutions, we use functions that become infinite at one point. Let us
illustrate with the Laplace and Poisson equations.

Let $M_0$ be a point of our space. We denote by $r(M, M_0)$ the distance
from the point $M_0$ to a variable point M. The function $1/r(M, M_0)$ for a
fixed $M_0$ is a function of the variable point M. It is easy to establish the
fact that this function is a harmonic function of the point M in the entire
space,* except of course, at the point $M_0$, where the function becomes
infinite, together with its derivatives.

* That is, the function satisfies the Laplace equation.

The sum of several functions of this form

$$\sum_{i=1}^{N} \frac{A_i}{r(M, M_i)},$$

where the points $M_1, M_2, \dots, M_N$ are any points in the space, is again a
harmonic function of the point M. This function will have singularities at
all the points $M_i$. If we choose the points $M_1, M_2, \dots, M_N$ as densely
distributed as we please in some volume $\Omega$, and at the same time multiply
by coefficients $A_i$, we may pass to the limit in this expression and get a
new function

$$U = \lim \sum_{i=1}^{N} \frac{A_i}{r(M, M_i)}
    = \iiint_\Omega \frac{A(M')}{r(M, M')}\, d\Omega,$$

where the points M' range over all of the volume $\Omega$. The integral in this
form is called a Newtonian potential. It may be shown, although we will
not do it here, that the function U thus constructed satisfies the equation
$\Delta U = -4\pi A$.

The Newtonian potential has a simple physical meaning. To understand
it, we will begin with the function $A_i / r(M, M_i)$.

The partial derivatives of this function with respect to the coordinates
are

$$\frac{\partial}{\partial x} \frac{A_i}{r} = A_i \frac{x_i - x}{r^3} = X, \qquad
  \frac{\partial}{\partial y} \frac{A_i}{r} = A_i \frac{y_i - y}{r^3} = Y, \qquad
  \frac{\partial}{\partial z} \frac{A_i}{r} = A_i \frac{z_i - z}{r^3} = Z.$$

At the point $M_i$ we place a mass $A_i$, which will attract all bodies with
a force directed toward the point $M_i$ and inversely proportional to the
square of the distance from $M_i$. We decompose this force into its
components along the coordinate axes. If the magnitude of the force acting on a
material point of unit mass is $A_i / r^2$, the cosines of the angles between the
direction of this force and the coordinate axes will be $(x_i - x)/r$,
$(y_i - y)/r$, $(z_i - z)/r$. Thus the components of the force exerted on a unit
mass at the point M by an attracting center $M_i$ will be equal to X, Y, and Z,
the partial derivatives of the function $A_i / r$ with respect to the
coordinates. If we place attracting masses at points $M_1, M_2, \dots, M_N$, then
every material point with unit mass placed at a point M will be acted on by a
force equal to the resultant of all the forces acting on it from the given
points $M_i$. In other words

$$X = \frac{\partial}{\partial x} \sum_{i=1}^{N} \frac{A_i}{r(M, M_i)}, \qquad
  Y = \frac{\partial}{\partial y} \sum_{i=1}^{N} \frac{A_i}{r(M, M_i)}, \qquad
  Z = \frac{\partial}{\partial z} \sum_{i=1}^{N} \frac{A_i}{r(M, M_i)}.$$

Passing to the limit and replacing the sum by an integral, we get

$$X = \frac{\partial U}{\partial x}, \qquad
  Y = \frac{\partial U}{\partial y}, \qquad
  Z = \frac{\partial U}{\partial z},$$

where

$$U = \iiint_\Omega \frac{A(M')}{r(M, M')}\, d\Omega.$$

The function U, with partial derivatives equal to the components of the
force acting on a point, is called the potential of the force. Thus the function
$A_i / r(M, M_i)$ is the potential of the attraction exerted by the point $M_i$,
the function $\sum_i A_i / r(M, M_i)$ is the potential of the attraction exerted by
the group of points $M_1, M_2, \dots, M_N$, and the function
$U = \iiint_\Omega (A/r)\, d\Omega$ is the potential of the attraction exerted by the
masses continuously distributed in the volume $\Omega$.
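Both statements about the function $A_i / r(M, M_i)$, that it is harmonic away
from $M_i$ and that its partial derivatives are the force components
$A_i (x_i - x)/r^3$ and so on, can be checked numerically. The Python sketch
below does this with central differences; the chosen points, the mass, the step
sizes, and all function names are assumptions made for illustration.

```python
import math


def point_potential(p, source, mass=1.0):
    """Potential A / r(M, M_i) of a mass A placed at the point M_i."""
    return mass / math.dist(p, source)


def laplacian_fd(f, p, h=1e-3):
    """Central-difference estimate of the Laplacian of f at the point p."""
    total = -6.0 * f(p)
    for axis in range(3):
        for step in (h, -h):
            q = list(p)
            q[axis] += step
            total += f(tuple(q))
    return total / h ** 2


def gradient_fd(f, p, h=1e-5):
    """Central-difference estimate of the gradient of f at the point p."""
    grad = []
    for axis in range(3):
        fwd, bwd = list(p), list(p)
        fwd[axis] += h
        bwd[axis] -= h
        grad.append((f(tuple(fwd)) - f(tuple(bwd))) / (2.0 * h))
    return grad


def force_components(p, source, mass=1.0):
    """X, Y, Z = A (x_i - x) / r^3, and similarly for y and z."""
    r = math.dist(p, source)
    return [mass * (s - c) / r ** 3 for s, c in zip(source, p)]


source = (0.0, 0.0, 0.0)   # attracting center M_i (assumed position)
point = (1.0, 2.0, 2.0)    # field point M, at distance r = 3
mass = 5.0                 # assumed mass A_i


def u(p):
    return point_potential(p, source, mass)


print(laplacian_fd(u, point))              # close to zero away from M_i
print(gradient_fd(u, point))               # numerical partial derivatives
print(force_components(point, source, mass))
```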

Instead of distributing the masses in a volume, we may place the points
$M_1, M_2, \dots, M_N$ on a surface S. Again increasing the number of these
points, we get in the limit the integral

$$V = \iint_S \frac{A(Q)}{r(M, Q)}\, ds, \tag{22}$$

where Q is a point on the surface S.

It is not difficult to see that this function will be harmonic everywhere
inside and outside the surface S. On the surface itself the function is
continuous, as can be proved, although its partial derivatives of the first
order have finite discontinuities.

The functions $\partial(1/r)/\partial x_i$, $\partial(1/r)/\partial y_i$, and
$\partial(1/r)/\partial z_i$ also are harmonic functions of the point M for fixed
$M_i$. From these functions in turn, we may form the sums

$$\sum_{i=1}^{N} A_i \frac{\partial}{\partial x_i} \frac{1}{r(M, M_i)}
  + \sum_{i=1}^{N} B_i \frac{\partial}{\partial y_i} \frac{1}{r(M, M_i)}
  + \sum_{i=1}^{N} C_i \frac{\partial}{\partial z_i} \frac{1}{r(M, M_i)},$$

which will be harmonic functions everywhere except perhaps at the points
$M_1, M_2, \dots, M_N$.

Of particular importance is the integral

$$W = \iint_S \mu(Q) \Big[ \frac{\partial (1/r)}{\partial x'} \cos(n, x)
      + \frac{\partial (1/r)}{\partial y'} \cos(n, y)
      + \frac{\partial (1/r)}{\partial z'} \cos(n, z) \Big] ds
    = \iint_S \mu(Q)\, K(Q, M)\, ds, \tag{23}$$

in which x', y', and z' are the coordinates of a variable point Q on the
surface S, n is the direction of the normal to the surface S at the point Q,
while x, y, and z are the directions of the coordinate axes, and r is the
distance from Q to the point M at which the value of the function W is
defined.

The integral (22) is called the potential of a simple layer, and the integral
(23) the potential of a double layer.* The potential of a double layer and
the potential of a simple layer represent a function harmonic inside and
outside of the surface S.

* The names of these potentials are connected with the following physical
fact. We assume that on the surface S, we have introduced electrical charges.
They create in the space an electric field. The potential of this field will be
represented by the integral (22), which is therefore called the potential of a
simple layer.

We now assume that the surface S is a thin nonconducting film. On one side of
it we distribute, according to some law, electric charges of one sign (for
example, positive). On the other side of S we distribute, with the same law,
electric charges of opposite sign. The action of these two electric layers also
generates in the space an electric field. As can be calculated, the potential
of this field will be represented by the integral (23).

Many problems in the theory of harmonic functions may be solved by
using potentials. By using the potential of a double layer, we may solve
the problem of constructing, in a given domain, a harmonic function u
having given values $2\pi\varphi(Q)$ on the boundary S of the domain. In
order to construct such a function, we only need to choose the function
$\mu(Q)$ in a suitable way.

This problem is somewhat reminiscent of the similar problem of finding
the coefficients in the series

$$\varphi = \sum_k a_k U_k$$

so that it may represent the function on the left side.

A remarkable property of the integral W consists of the fact that its
limiting value as the point M approaches $Q_0$ from the inner side of the
surface has the form

$$\lim W = 2\pi \mu(Q_0) + \iint_S K(Q, Q_0)\, \mu(Q)\, ds.$$

Equating this expression to the given function $2\pi\varphi(Q_0)$, we get the
equation

$$\mu(Q_0) + \frac{1}{2\pi} \iint_S K(Q, Q_0)\, \mu(Q)\, ds = \varphi(Q_0).$$

This equation is called an integral equation of the second kind. The theory
of such equations has been developed by many mathematicians. If we
can solve this equation by any method, we obtain a solution of our original
problem.
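The text leaves the method of solution open. One common numerical approach,
which is an assumption here and not the book's own method, replaces the
integral by a quadrature sum and solves the resulting linear system for the
values of the density at the quadrature points. The Python sketch below does
this for a one-dimensional analogue,
$\mu(s) + \int_0^1 K(s, t)\, \mu(t)\, dt = \varphi(s)$, with an invented kernel
$K(s, t) = st$ and right side $\varphi(s) = s$, for which $\mu(s) = 3s/4$ can be
verified by direct substitution. In the equation above the factor $1/(2\pi)$
would simply be absorbed into the kernel.

```python
import numpy as np


def solve_second_kind(kernel, rhs, n_nodes=8):
    """Solve mu(s) + int_0^1 K(s, t) mu(t) dt = rhs(s) at Gauss-Legendre nodes."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    nodes = 0.5 * (nodes + 1.0)   # map the nodes from [-1, 1] to [0, 1]
    weights = 0.5 * weights
    # Discrete system: mu_i + sum_j K(s_i, t_j) w_j mu_j = rhs(s_i)
    matrix = np.eye(n_nodes) + kernel(nodes[:, None], nodes[None, :]) * weights[None, :]
    return nodes, np.linalg.solve(matrix, rhs(nodes))


# Toy data (hypothetical, chosen so that the exact density is known).
nodes, mu = solve_second_kind(lambda s, t: s * t, lambda s: s)
print(float(np.max(np.abs(mu - 0.75 * nodes))))
```

For a surface integral the same idea applies with quadrature points and weights
on S, but the kernel K(Q, Q_0) of the double layer needs more careful treatment
than this toy kernel.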

In exactly the same way, we may find a solution of other problems in the
theory of harmonic functions. After choice of a suitable potential, the
density, i.e., the value of an arbitrary function appearing in it, is defined
in such a way that all the prescribed conditions are fulfilled.

From a physical point of view, this means that every harmonic function
may be represented as the potential of a double electric layer, if we
distribute this layer over a surface S with appropriate density.


Approximate construction of solutions; Galerkin's method and the
method of nets. 1. We have discussed two methods for solving equations
of mathematical physics: the method of complete separation of variables
and the method of potentials. These methods were developed by scientists
of the 18th and 19th centuries, Fourier, Poisson, Ostrogradskiĭ, Ljapunov,
and others. In the 20th century they were augmented by a series of other
methods. We will examine two of them, Galerkin's method and the method
of finite differences, or the method of nets.

The first method was proposed by the Academician B. G. Galerkin for
the solution of equations of the form

$$\sum_{i,j,k,l} A_{ijkl} \frac{\partial^4 U}{\partial x_i\, \partial x_j\, \partial x_k\, \partial x_l}
  + \sum_{i,j,k} B_{ijk} \frac{\partial^3 U}{\partial x_i\, \partial x_j\, \partial x_k}
  + \sum_{i,j} C_{ij} \frac{\partial^2 U}{\partial x_i\, \partial x_j}
  + \sum_{i} D_i \frac{\partial U}{\partial x_i} + E U + \lambda U = 0,$$

containing an unknown parameter $\lambda$, where the indices i, j, k, and l
independently take on the values 1, 2, and 3. These equations are derived
from equations containing an independent variable t, by using the method
of separation of variables in the same way as the wave equation

$$\frac{\partial^2 u}{\partial t^2} = \Delta u$$

leads to the equation $\Delta U + \lambda^2 U = 0$. The problem consists of finding
those values of $\lambda$ for which the homogeneous boundary-value problem has
a nonzero solution and then constructing that solution.

The essence of Galerkin's method is as follows. The unknown function
is sought in the approximate form

$$U \approx \sum_{m=1}^{N} a_m\, \omega_m(x_1, x_2, x_3),$$

where the $\omega_m(x_1, x_2, x_3)$ are arbitrary functions satisfying the boundary
conditions.

The assumed solution is substituted in the left side of the equation,
resulting in the approximate equation

$$\sum_{m=1}^{N} a_m \Big[ \sum_{i,j,k,l} A_{ijkl} \frac{\partial^4 \omega_m}{\partial x_i\, \partial x_j\, \partial x_k\, \partial x_l}
  + \sum_{i,j,k} B_{ijk} \frac{\partial^3 \omega_m}{\partial x_i\, \partial x_j\, \partial x_k}
  + \sum_{i,j} C_{ij} \frac{\partial^2 \omega_m}{\partial x_i\, \partial x_j}
  + \sum_{i} D_i \frac{\partial \omega_m}{\partial x_i} + E \omega_m \Big]
  + \lambda \sum_{m=1}^{N} a_m \omega_m \approx 0.$$

For brevity we denote the expression inside the brackets by $L\omega_m$, and
write the equation in the form

$$\sum_{m=1}^{N} a_m L\omega_m + \lambda \sum_{m=1}^{N} a_m \omega_m \approx 0.$$

Now we multiply both sides of our approximate equation by $\omega_n$
(n = 1, 2, ..., N) and integrate over the domain $\Omega$ in which the solution is
sought. We get

$$\iiint_\Omega \sum_{m=1}^{N} a_m\, \omega_n L\omega_m\, d\Omega
  + \lambda \iiint_\Omega \sum_{m=1}^{N} a_m\, \omega_m \omega_n\, d\Omega \approx 0,$$

which may be rewritten in the form

$$\sum_{m=1}^{N} a_m \iiint_\Omega \omega_n L\omega_m\, d\Omega
  + \lambda \sum_{m=1}^{N} a_m \iiint_\Omega \omega_m \omega_n\, d\Omega \approx 0.$$

If we set ourselves the aim of satisfying these equations exactly, we will
have a system of algebraic equations of the first degree for the unknown
coefficients $a_m$. The number of equations in the system will be equal to
the number of unknowns, so that this system will have a nonvanishing
solution only if its determinant is zero. If this determinant is expanded,
we get an equation of the Nth degree for the unknown number $\lambda$.

After finding the value of $\lambda$ and substituting it in the system, we solve
this system to obtain approximate expressions of the function U.

Galerkin's method is not only suitable for equations of the fourth order,
but may be applied to equations of different orders and different types.
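As a small worked instance of these steps, the following Python sketch applies
the method to the one-dimensional problem $U'' + \lambda U = 0$, U(0) = U(1) = 0,
whose smallest eigenvalue is $\pi^2$. The problem itself, the trial functions
$\omega_m = x^m (1 - x)$, and all names in the code are assumptions chosen for
illustration; here $L = d^2/dx^2$, and the integrals are computed exactly for
polynomials.

```python
import numpy as np
from numpy.polynomial import Polynomial


def trial_function(m):
    """omega_m(x) = x**m * (1 - x), which vanishes at x = 0 and x = 1."""
    return Polynomial([0.0, 1.0]) ** m * Polynomial([1.0, -1.0])


def integral_01(p):
    """Exact integral of the polynomial p over [0, 1]."""
    antiderivative = p.integ()
    return antiderivative(1.0) - antiderivative(0.0)


def galerkin_eigenvalues(n_terms):
    """Approximate values of lambda for U'' + lambda * U = 0, U(0) = U(1) = 0."""
    omegas = [trial_function(m) for m in range(1, n_terms + 1)]
    # l_matrix[n, m] = integral of omega_n * L omega_m, with L = d^2/dx^2
    l_matrix = np.array(
        [[integral_01(wn * wm.deriv(2)) for wm in omegas] for wn in omegas]
    )
    # gram[n, m] = integral of omega_n * omega_m
    gram = np.array([[integral_01(wn * wm) for wm in omegas] for wn in omegas])
    # (l_matrix + lambda * gram) a = 0  <=>  (-gram^{-1} l_matrix) a = lambda a
    eigenvalues = np.linalg.eigvals(np.linalg.solve(gram, -l_matrix))
    return np.sort(eigenvalues.real)


for n_terms in (1, 2, 3):
    print(n_terms, galerkin_eigenvalues(n_terms)[0], np.pi ** 2)
```

Setting the determinant to zero is carried out here by an eigenvalue routine
instead of expanding a polynomial of degree N, which gives the same values of
$\lambda$. With a single trial function the approximation is 10, and it moves
toward $\pi^2$ as more terms are added.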
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2. The last of the methods that we will examine is called the method of 
finite differences or the method of nets.

The derivative of the function u with respect to the variable x is defined
as the limit of the quotient

$$\frac{u(x + \Delta x) - u(x)}{\Delta x}.$$

This quotient in its turn may be represented in the form

$$\frac{1}{\Delta x} \int_x^{x + \Delta x} \frac{du}{dx_1}\, dx_1,$$

and from the well-known theorem of the mean value (cf. Chapter II, §8):

$$\frac{u(x + \Delta x) - u(x)}{\Delta x} = \frac{du}{dx}\Big|_{x = \xi},$$

where $\xi$ is a point in the interval

$$x < \xi < x + \Delta x.$$

All the second derivatives of u, both the mixed derivatives and the
derivatives with respect to one variable, may also be approximately represented
in the form of difference quotients. Thus the difference quotient

$$\frac{u(x + \Delta x) - 2u(x) + u(x - \Delta x)}{(\Delta x)^2}$$

is represented in the form

$$\frac{1}{\Delta x} \Big[ \frac{u(x + \Delta x) - u(x)}{\Delta x}
      - \frac{u(x) - u(x - \Delta x)}{\Delta x} \Big]
  = \frac{1}{\Delta x} \Big[ \frac{u(x_1 + \Delta x) - u(x_1)}{\Delta x} \Big]_{x_1 = x - \Delta x}^{x_1 = x}.$$

From the mean-value theorem the difference quotient of the function

$$\varphi(x_1) = \frac{u(x_1 + \Delta x) - u(x_1)}{\Delta x}$$

may be replaced by the value of the derivative. Consequently

$$\frac{\varphi(x) - \varphi(x - \Delta x)}{\Delta x} = \varphi'(\xi),$$

where $\xi$ is some intermediate value in the interval

$$x - \Delta x < \xi < x.$$

Thus

$$\Big( \frac{1}{\Delta x} \Big)^2 [u(x + \Delta x) - 2u(x) + u(x - \Delta x)]
  = \frac{1}{\Delta x} [\varphi(x) - \varphi(x - \Delta x)] = \varphi'(\xi).$$

On the other hand

$$\varphi'(x_1) = \frac{u'(x_1 + \Delta x) - u'(x_1)}{\Delta x},$$

which means that

$$\varphi'(\xi) = \frac{u'(\xi + \Delta x) - u'(\xi)}{\Delta x}.$$

Once more using the formula for finite increments, we see that

$$\varphi'(\xi) = u''(\eta),$$

where

$$\xi < \eta < \xi + \Delta x.$$

Consequently,

$$\Big( \frac{1}{\Delta x} \Big)^2 [u(x + \Delta x) - 2u(x) + u(x - \Delta x)] = u''(\eta),$$

where $x - \Delta x < \eta < x + \Delta x$.

If the derivative u''(x) is continuous and the value of $\Delta x$ is sufficiently
small, then $u''(\eta)$ will be only slightly different from u''(x). Thus our
second derivative is arbitrarily close to the difference quotient in question.
In exactly the same way it may be shown, for example, that the mixed
second derivative

$$\frac{\partial^2 u}{\partial x\, \partial y}$$

can be approximately represented by the formula

$$\frac{\partial^2 u}{\partial x\, \partial y} \approx \frac{1}{\Delta x\, \Delta y}
  [u(x + \Delta x, y + \Delta y) - u(x + \Delta x, y) - u(x, y + \Delta y) + u(x, y)].$$
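The two difference quotients are easy to try out. In the Python sketch below,
the test functions (the exponential and $e^{x + 2y}$), the evaluation points, and
the step sizes are assumptions chosen only because their derivatives are known
exactly.

```python
import math


def second_difference(u, x, dx):
    """[u(x + dx) - 2 u(x) + u(x - dx)] / dx^2, an approximation to u''(x)."""
    return (u(x + dx) - 2.0 * u(x) + u(x - dx)) / dx ** 2


def mixed_difference(u, x, y, dx, dy):
    """Difference quotient approximating the mixed derivative d^2u / dx dy."""
    return (u(x + dx, y + dy) - u(x + dx, y) - u(x, y + dy) + u(x, y)) / (dx * dy)


def w(x, y):
    """Test function with exact mixed derivative 2 * exp(x + 2y)."""
    return math.exp(x + 2.0 * y)


# u(x) = exp(x) has u''(0) = 1; the error shrinks as dx decreases.
for dx in (1e-1, 1e-2, 1e-3):
    print(dx, abs(second_difference(math.exp, 0.0, dx) - 1.0))

# At (0, 0) the exact mixed derivative of w is 2.
print(abs(mixed_difference(w, 0.0, 0.0, 1e-4, 1e-4) - 2.0))
```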


We return now to our partial differential equation.

For definiteness, let us assume that we are dealing with the Laplace
equation in two independent variables

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$

Further, let the unknown function u be given on the boundary S of the
domain $\Omega$. As an approximation we assume that

$$\frac{\partial^2 u}{\partial x^2} = \frac{u(x + \Delta x, y) - 2u(x, y) + u(x - \Delta x, y)}{(\Delta x)^2},$$

$$\frac{\partial^2 u}{\partial y^2} = \frac{u(x, y + \Delta y) - 2u(x, y) + u(x, y - \Delta y)}{(\Delta y)^2}.$$

If we put $\Delta x = \Delta y = h$, then

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
  = \frac{1}{h^2} [u(x + h, y) + u(x, y + h) + u(x - h, y) + u(x, y - h) - 4u(x, y)].$$

Now let us cover the domain $\Omega$ with a square net with vertices at the
points x = kh, y = lh (figure 4). We replace the domain by the polygon
consisting of those squares of our net that fall inside $\Omega$, so that the
boundary of the domain is changed into a broken line. We take the values of the
unknown function on this broken line to be those given on the boundary S. The
Laplace equation is then approximated by the equation

$$u(x + h, y) + u(x, y + h) + u(x - h, y) + u(x, y - h) - 4u(x, y) = 0$$

for all interior points of the domain. This equation may be rewritten in
the form

$$u(x, y) = \tfrac{1}{4} [u(x + h, y) + u(x, y + h) + u(x - h, y) + u(x, y - h)].$$

Then the value of u at any point of the net, for example the point 1 in
figure 4, is equal to the arithmetic mean of its values at the four adjacent
points.

Fig. 4.

We assume that inside the polygon there are N points of our net. At every
such point we will have a corresponding equation. In this manner we get a
system of N algebraic equations in N unknowns, the solution of which
gives us the approximate values of the function u on the domain $\Omega$.

It may be shown that for the Laplace equation the solution may be
found to any desired degree of accuracy.

The method of finite differences reduces the problem to the solution of
a system of N equations in N unknowns, where the unknowns are the
values of the desired function at the knots of some net.
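The system of N equations can be solved by repeatedly replacing each interior
value by the mean of its four neighbours. The Python sketch below does this on
the unit square; the square domain, the mesh width, the number of sweeps, the
function names, and the boundary data $x^2 - y^2$ are all assumptions made for
illustration. That boundary function is chosen because it satisfies the net
equation exactly, so the computed values can be compared with it at every point
of the net.

```python
import numpy as np


def solve_laplace_on_square(boundary, n=10, sweeps=2000):
    """Five-point scheme on the unit square with mesh width h = 1/n."""
    h = 1.0 / n
    coords = np.arange(n + 1) * h
    x, y = np.meshgrid(coords, coords, indexing="ij")
    u = np.zeros((n + 1, n + 1))
    edge = np.zeros_like(u, dtype=bool)
    edge[0, :] = edge[-1, :] = edge[:, 0] = edge[:, -1] = True
    u[edge] = boundary(x[edge], y[edge])   # prescribed values on the boundary
    for _ in range(sweeps):
        # Each interior value becomes the mean of its four neighbours,
        # computed from the values of the previous sweep.
        u[1:-1, 1:-1] = 0.25 * (
            u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
        )
    return x, y, u


def harmonic(x, y):
    """Assumed test data: x^2 - y^2 satisfies the Laplace equation."""
    return x ** 2 - y ** 2


x, y, u = solve_laplace_on_square(harmonic)
print(float(np.max(np.abs(u - harmonic(x, y)))))
```

With n = 10 there are 81 interior points, so the sweeps solve a system of
81 equations in 81 unknowns without forming its matrix.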

Further, the method of finite differences can be shown to be applicable
to other problems of mathematical physics: to other differential equations
and to integral equations. However, its application in many cases involves
a number of difficulties.

It may turn out that the solution of the system of N algebraic equations
in N unknowns, constructed by the method of nets, either does not exist
in general or gives a result that is quite far from the true one. This happens
when the solution of the system of equations leads to accumulation of
errors; the smaller we take the length of the sides of the squares in the net
the more equations we get, so that the accumulated error may become
greater.

In the example given previously of the Laplace equation, this does not
happen. The errors in solving this system do not accumulate but, on the
contrary, steadily decrease if we solve the system, for example, by a method
of successive approximations. For the equation of heat flow and for the
wave equation it is essential to choose the nets properly. For these
equations we may get both good and bad results.

If we are going to solve either of these equations by the method of nets,
after choosing the net for the values of t, we must not choose too fine a net
for the space variables. Otherwise we get a very unsatisfactory system of
equations for the values of the unknown function; its solution gives a
result that oscillates rapidly with large amplitudes and is thus very far
from the true one.

The great variety of possible results may best be seen in a simple
numerical example. Consider the equation of heat flow

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$

in the case in which the temperature does not depend on y or z. We take the
mesh width of the net along the values of t equal to k and along the values
of x equal to h:

$$\frac{\partial u}{\partial t} \approx \frac{u(t + k, x) - u(t, x)}{k},$$

$$\frac{\partial^2 u}{\partial x^2} \approx \frac{u(t, x + h) - 2u(t, x) + u(t, x - h)}{h^2}.$$

Then our equation may be written approximately in the form

$$u(t + k, x) = \frac{k}{h^2}\, u(t, x + h) + \Big( 1 - \frac{2k}{h^2} \Big) u(t, x)
  + \frac{k}{h^2}\, u(t, x - h).$$

If, for a certain mesh-point value of t, we know the values of u at the points
x - h, x, and x + h, it is easy to find the value of u at the point x and the
next mesh point t + k. Assume that the constant k, i.e., the mesh width
in the net with respect to t, is already chosen. Let us consider two cases
for the choice of h. We put $h^2 = k$ in the first case and $h^2 = 2k$ in the
second and solve the following problem by the method of nets.

At the initial instant, u = 0 for all negative values of x, and u = 1 for
all nonnegative values of x. We will have, writing in one line the values
of the unknown function u for the given instant, two tables:


Table 1 ($h^2 = k$)

t \ x    -5h  -4h  -3h  -2h   -h    0    h   2h   3h   4h   5h
0          0    0    0    0    0    1    1    1    1    1    1
k          0    0    0    0    1    0    1    1    1    1    1
2k         0    0    0    1   -1    2    0    1    1    1    1
3k         0    0    1   -2    4   -3    3    0    1    1    1
4k         0    1   -3    7   -9   10   -6    4    0    1    1
5k         1   -4   11  -19   26  -25   20  -10    5    0    1


Table 2 ($h^2 = 2k$)

t \ x      -5h    -4h    -3h    -2h     -h      0      h     2h     3h     4h     5h
0            0      0      0      0      0      1      1      1      1      1      1
k            0      0      0      0    1/2    1/2      1      1      1      1      1
2k           0      0      0    1/4    1/4    3/4    3/4      1      1      1      1
3k           0      0    1/8    1/8    1/2    1/2    7/8    7/8      1      1      1
4k           0   1/16   1/16   5/16   5/16  11/16  11/16  15/16  15/16      1      1
5k        1/32   1/32   3/16   3/16    1/2    1/2  13/16  13/16  31/32  31/32      1
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In Table 2 we obtain values, for any given instant of time, which vary 
smoothly from point to point. This table gives a good approximation to 
the solution of the heat-flow equation. On the other hand, in Table 1,
in which, as it would seem, the exactness should have been increased
because of our finer division for the x-interval, the values of u oscillate
very rapidly from positive values to negative ones and attain values that
are much greater than the initially prescribed ones. It is clear that in this
table the values are extraordinarily far from those that correspond to
the true solution.

From these examples it is clear that if we wish to use the method of nets
to get sufficiently accurate and reliable results, we must exercise great
discretion in our choice of intervals in the net and must make preliminary
investigations to justify the application of the method.
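Both tables can be regenerated from the difference equation with exact
fractions. The Python sketch below is an illustration: the function name, the
padding of the grid, and the use of `Fraction` are choices made here. The only
parameter is the ratio $k/h^2$, equal to 1 for Table 1 and to 1/2 for Table 2.
A standard fact that the text does not state is that this explicit scheme is
stable only when $k/h^2 \le 1/2$, which is why the first choice fails.

```python
from fractions import Fraction


def explicit_heat_rows(ratio, steps=5, half_width=5):
    """Rows u(0, .), u(k, .), ..., u(steps*k, .) at x = -half_width*h, ..., half_width*h.

    ratio is k / h^2. The grid is padded on both sides so that the frozen end
    values cannot influence the reported columns within the given steps.
    """
    pad = steps + 1
    n = half_width + pad
    u = [Fraction(0) if i < 0 else Fraction(1) for i in range(-n, n + 1)]
    rows = [u]
    for _ in range(steps):
        new = u[:]   # the two end values are simply kept
        for j in range(1, len(u) - 1):
            new[j] = ratio * u[j + 1] + (1 - 2 * ratio) * u[j] + ratio * u[j - 1]
        u = new
        rows.append(u)
    return [row[pad:-pad] for row in rows]


table1 = explicit_heat_rows(Fraction(1))      # h^2 = k
table2 = explicit_heat_rows(Fraction(1, 2))   # h^2 = 2k

for name, table in (("Table 1", table1), ("Table 2", table2)):
    print(name)
    for row in table:
        print("  ".join(f"{str(value):>6}" for value in row))
```

In Table 2 every new value is the mean of two earlier values and so stays
between 0 and 1, while in Table 1 the middle coefficient is -1 and the values
grow in magnitude from row to row.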

The solutions obtained by using the equations of mathematical physics
for these or other problems of natural science give us a mathematical
description of the expected course or the expected character of the physical
events described by these equations.

Since the construction of a model is carried out by means of the
equations of mathematical physics, we are forced to ignore, in our
abstractions, many aspects of these events, to reject certain aspects as
nonessential and to select others as basic, from which it follows that the
results we obtain are not absolutely true. They are absolutely true only for
that scheme or model that we have considered, but they must always be
compared with experiment, if we are to be sure that our model of the
event is close to the event itself and represents it with a sufficient degree
of exactness.

The ultimate criterion of the truth of the results is thus practical
experience alone, although experience can only be properly understood
in the light of a profound and well-developed theory.

If we consider the vibrating string of a musical instrument, we can
understand how it produces its tones only if we are acquainted with the
laws for superposition of characteristic oscillations. The relations that hold
among the frequencies can be understood only if we investigate how these
frequencies are determined by the material, by the tension in the string,
and by the manner of fixing the ends. In this case the theory not only
provides a method of calculating any desired numerical quantities but
also indicates just which of these quantities are of fundamental importance,
exactly how the physical process occurs, and what should be observed in
it.

In this way a domain of science, namely mathematical physics, not
only grew out of the requirements of practice but in turn exercised its
own influence on that practice and pointed out paths for further progress.

Mathematical physics is very closely connected with other branches of
mathematical analysis, but we cannot discuss these connections here,
since they would lead us too far afield.


§6. Generalized Solutions 

The range of problems in which a physical process is described by
continuous, differentiable functions satisfying differential equations may
be extended in an essential way by introducing into the discussion
discontinuous solutions of these equations.

In a number of cases it is clear from the beginning that the problem
under consideration cannot have solutions that are twice continuously
differentiable; in other words, from the point of view of the classical
statement of the problem given in the preceding section, such a problem
has no solution. Nevertheless the corresponding physical process does
occur, although we cannot find functions describing it in the preassigned
class of twice-differentiable functions. Let us consider some simple
examples.


1. If a string consists of two pieces of different density, then in the
equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \tag{24}$$

the coefficient a will be equal to a different constant on each of the
corresponding pieces, and so equation (24) will not, in general, have classical
(twice continuously differentiable) solutions.

2. Let the coefficient a be a constant, but in the initial position let the
string have the form of a broken line given by the equation
$u|_{t=0} = \varphi(x)$. At the vertex of the broken line, the function
$\varphi(x)$ obviously cannot have a first derivative. It may be shown that there
exists no classical solution of equation (24) satisfying the initial conditions

$$u|_{t=0} = \varphi(x), \qquad u_t|_{t=0} = 0$$

(here and in what follows $u_t$ denotes $\partial u / \partial t$).

3. If a sharp blow is given to any small piece of the string, the resulting
oscillations are described by the equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} + f(x, t),$$

where f(x, t) corresponds to the effect produced and is a discontinuous
function, differing from zero only on the small piece of the string and
during a short interval of time. Such an equation also, as can be easily
established, cannot have classical solutions.

These examples show that requiring continuous derivatives for the 
desired solution strongly restricts the range of the problems we can solve. 
The search for a wider range of solvable problems proceeded first of all 
in the direction of allowing discontinuities of the first kind in the derivatives 
of highest order, for the functions serving as solutions to the problems, 
where these functions must satisfy the equations except at the points of 
discontinuity. It turns out that the solutions of an equation of the type 
$\Delta u = 0$ or $\partial u/\partial t - \Delta u = 0$ cannot have such (so-called weak) 
discontinuities inside the domain of definition. Solutions of the wave equation 
can have weak discontinuities in the space variables $x$, $y$, $z$, and in $t$ only 
on surfaces of a special form, which are called characteristic surfaces. If a 
solution $u(x, y, z, t)$ of the wave equation is considered as a function 
defining, for $t = t_1$, a scalar field in the $x, y, z$ space at the instant $t_1$, 
then the surfaces of discontinuity for the second derivatives of $u(x, y, z, t)$ 
will travel through the $(x, y, z)$ space with a velocity equal to the square 
root of the coefficient of the Laplacian in the wave equation. 

The second example for the string shows that it is also necessary to 
consider solutions in which there may be discontinuous first derivatives; 
and in the case of sound and light waves, we must even consider solutions 
that themselves have discontinuities. 

The first question that comes up in investigating the introduction of 
discontinuous solutions consists in making clear exactly which discontinuous 
functions can be considered as physically admissible solutions of an 
equation or of the corresponding physical problem. We might, for example, 
assume that an arbitrary piecewise constant function is “a single solution” 
of the Laplace equation or the wave equation, since it satisfies the equation 
outside of the lines of discontinuity. 

In order to clarify this question, the first thing that must be guaranteed 
is that in the wider class of functions, to which the admissible solutions 
must belong, we must have a uniqueness theorem. It is perfectly clear that 
if, for example, we allow arbitrary piecewise smooth functions, then this 
requirement will not be satisfied. 

Historically, the first principle for selection of admissible functions was 
that they should be the limits (in some sense or other) of classical solutions 
of the same equation. Thus, in example 2, a solution of equation (24) 
corresponding to the function $\varphi(x)$, which does not have a derivative at 
an angular point, may be found as the uniform limit of classical solutions 
$u_n(x, t)$ of the same equation corresponding to the initial conditions 
$u_n|_{t=0} = \varphi_n(x)$, $u_{nt}|_{t=0} = 0$, where the $\varphi_n(x)$ are twice continuously 
differentiable functions converging uniformly to $\varphi(x)$ for $n \to \infty$. 

In what follows, instead of this principle we will adopt the following: 
An admissible solution $u$ must satisfy, instead of the equation $Lu = f$, 
an integral identity containing an arbitrary function $\Phi$. 

This identity is found as follows: We multiply both sides of the equation 
$Lu = f$ by an arbitrary function $\Phi$, which has continuous derivatives with 
respect to all its arguments of orders up through the order of the equation 
and vanishes outside of the finite domain $D$ in which the equation is 
defined. The equation thus found is integrated over $D$ and then transformed 
by integration by parts so that it does not contain any derivatives 
of $u$. As a result we get the identity desired. For equation (24), for example, 
it has the form 

$$\iint_D u \left( \frac{\partial^2 \Phi}{\partial t^2} - a^2 \frac{\partial^2 \Phi}{\partial x^2} \right) dx\,dt = 0.$$
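The integral identity can be tried out numerically. The following is a minimal sketch, not part of the original text, under several assumptions: $a = 1$; the identity for equation (24) is taken in the form $\iint_D u\,(\Phi_{tt} - a^2 \Phi_{xx})\,dx\,dt = 0$; the string is infinite and starts at rest from a broken line; and its motion is written in the standard d'Alembert form $u = \tfrac12[\varphi(x - at) + \varphi(x + at)]$, which this section does not derive. The function names and the particular test function $\Phi$ (a product of two polynomial bumps) are hypothetical choices made only for illustration.

```python
import numpy as np

def tent(x):
    """Broken-line initial shape phi(x) with a vertex at x = 0 (assumed for illustration)."""
    return np.maximum(0.0, 1.0 - np.abs(x))

def bump(s):
    """(1 - s^2)^3 for |s| < 1 and 0 outside: twice continuously differentiable."""
    return np.where(np.abs(s) < 1.0, (1.0 - s**2) ** 3, 0.0)

def bump_dd(s):
    """Second derivative of bump(s) with respect to s."""
    return np.where(np.abs(s) < 1.0, 6.0 * (1.0 - s**2) * (5.0 * s**2 - 1.0), 0.0)

def weak_residual(u, a=1.0, R=3.0, nx=1200, nt=400):
    """Midpoint-rule value of the double integral of u * (Phi_tt - a^2 Phi_xx)
    over D = (-R, R) x (0, 2), with Phi(x, t) = bump(x / R) * bump(t - 1)."""
    dx, dt = 2.0 * R / nx, 2.0 / nt
    x = -R + (np.arange(nx) + 0.5) * dx
    t = (np.arange(nt) + 0.5) * dt
    X, T = np.meshgrid(x, t, indexing="ij")
    phi_tt = bump(X / R) * bump_dd(T - 1.0)
    phi_xx = bump_dd(X / R) / R**2 * bump(T - 1.0)
    return float(np.sum(u(X, T) * (phi_tt - a**2 * phi_xx)) * dx * dt)

def travelling(x, t, a=1.0):
    """d'Alembert form: two half-height copies of the tent moving apart with speed a."""
    return 0.5 * (tent(x - a * t) + tent(x + a * t))

def frozen(x, t):
    """The same broken line held fixed in time: not a motion of the string."""
    return tent(x)

print(weak_residual(travelling))  # close to zero, up to quadrature error
print(weak_residual(frozen))      # clearly different from zero
```

Although neither function has second derivatives along the lines where it is broken, the identity separates them: the travelling profile passes the test for this $\Phi$, while the frozen profile does not. A single test function is of course only a spot check, since the identity is required for every admissible $\Phi$.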

S. L. Sobolev has shown that for equations with constant coefficients 
these two principles for the selection of admissible (or as they are now 
usually called, generalized) solutions, are equivalent to each other. But for 
equations with variable coefficients, the first principle may turn out to be 
inapplicable, since these equations may in general have no classical 
solutions (cf. example 1). The second of these principles provides the 
possibility of selecting generalized solutions with very broad assumptions 
on the differentiability properties of the coefficients of the equations. It is 
true that this principle seems at first sight to be overly formal and to have 
a purely mathematical character, which does not directly indicate how 
the problems ought to be formulated in a manner similar to the classical 
problems. 

We give here a modification that, it seems to us, is more appropriate 
physically, since it is directly connected with the well-known principle of 
Hamilton. 

As is well known, analysis of the methods of deducing various equations 
of mathematical physics led in the first half of the 19th century to the 
discovery of a new law known as Hamilton’s principle. Starting from this 
principle, it was possible to obtain in a uniform manner all the known 
equations of mathematical physics. We will illustrate this by the example 
of the problem considered in §3 for the oscillations of a string of finite 
length with fixed ends. 

First of all we construct the so-called Lagrange function $L(t)$ for our 
string, namely the difference between the kinetic and potential energies. 
From what was said in §3 it follows that 

$$L(t) = \frac{1}{2} \int_0^l \left( \rho u_t^2 - T u_x^2 \right) dx.$$

According to Hamilton’s principle, the integral 

$$S = \int_{t_1}^{t_2} L(t)\,dt$$

assumes its minimum value for the function $u(x, t)$ corresponding to the 
true motion of the string, compared with all other functions $v(x, t)$ which 
are equal to zero for $x = 0$ and $x = l$ and coincide with $u(x, t_1)$ and 
$u(x, t_2)$ for $t = t_1$ and $t = t_2$. Here $t_1$ and $t_2$ are fixed arbitrarily, and the 
functions $v$ must have finite integrals $S$. As a result of this principle the 
so-called first variation of $S$ (cf. Chapter VIII) must be equal to zero, i.e., 

$$\delta S = \int_{t_1}^{t_2} \int_0^l \left( \rho u_t \Phi_t - T u_x \Phi_x \right) dx\,dt = 0, \tag{25}$$

where $\Phi(x, t)$ is an arbitrary function differentiable with respect to $x$ and $t$ 
and equal to zero on the edges of the rectangle $0 \le x \le l$, $t_1 \le t \le t_2$. 

Equation (25) is also the condition that must be met by the desired 
function $u(x, t)$. If we know that $u(x, t)$ has derivatives of the second 
order, then condition (25) may be put in a different form. Integrating 
(25) by parts and applying the fundamental lemma of the calculus of 
variations, we find that $u(x, t)$ must satisfy the equation 

$$\frac{\partial}{\partial t} \left( \rho \frac{\partial u}{\partial t} \right) - \frac{\partial}{\partial x} \left( T \frac{\partial u}{\partial x} \right) = 0, \tag{26}$$

which is identical with (24), if $\rho$ and $T$ are constants and $T/\rho = a^2$. 

It is not difficult to see that any solution $u(x, t)$ of equation (26) satisfies 
the identity (25) for all given $\Phi$. The converse turns out to be false, since 
$u(x, t)$ may in general not have second derivatives. So we are extending 
the range of solvable problems, if we replace equation (26) by the identity 
(25). 

To determine a specific oscillation of the string, we must add to the 
boundary conditions 

$$u(0, t) = u(l, t) = 0 \tag{27}$$

the initial conditions 

$$u(x, 0) = \varphi_0(x), \qquad u_t(x, 0) = \varphi_1(x). \tag{28}$$

If a solution is sought in the class of continuously differentiable 
functions, then conditions (27) and (28) may be stated separately from (25) 
as requirements to be met. But if we allow the proposed solution to be 
“worse,” then these conditions lose their meaning in the form given and 
they must be partly or wholly included in the integral identity (25). 

For example, let $u(x, t)$ be continuous for $0 \le x \le l$, $0 \le t \le T$, but 
let its first derivatives have discontinuities. The second equation in (28) 
then loses its meaning as a limiting condition. In this case the problem 
can be stated as follows: to find a continuous function $u$ which fulfills 
condition (27) and the first of the conditions (28) and for which the equation 

$$\int_0^T \int_0^l \left( \rho u_t \Phi_t - T u_x \Phi_x \right) dx\,dt + \int_0^l \rho\,\varphi_1(x)\,\Phi(x, 0)\,dx = 0 \tag{29}$$

is identically satisfied for all continuous $\Phi(x, t)$ equal to zero for $x = 0$, 
$x = l$ and $t = T$. (In the limits of integration $T$ denotes the final instant 
of time; inside the integrand it denotes, as before, the tension.) Here the 
functions $u$ and $\Phi$ must both have first derivatives whose squares are 
integrable in the sense of Lebesgue on the rectangle $0 \le x \le l$, 
$0 \le t \le T$. This last requirement for $u$ means that the mean value with 
respect to time of the total energy of the string 

$$\frac{1}{2T} \int_0^T \int_0^l \left( \rho u_t^2 + T u_x^2 \right) dx\,dt$$

must be finite. Such a restriction on the function $u$, and thus also on its 
possible variations $\Phi$, is a natural result of Hamilton’s principle. 

The identity (29) is precisely the condition that the first variation of 
the functional 

$$S = \frac{1}{2} \int_0^T \int_0^l \left( \rho u_t^2 - T u_x^2 \right) dx\,dt + \int_0^l \rho\,\varphi_1(x)\,u|_{t=0}\,dx$$

be equal to zero. Thus the problem of the vibration of a fixed string in 
the case considered may be stated as the problem of finding the minimum 
of the functional $S$ for all functions $v(x, t)$ which are continuous, satisfy 
condition (27), and are equal to $u(x, T)$ for $t = T$. Moreover, the desired 
function must satisfy the first of conditions (28). 

This modification of Hamilton’s principle allows us not only to widen 
the class of admissible solutions of equation (24) but also to state a 
well-defined boundary-value problem for them. 

The fact that these generalized solutions or some of their derivatives are 
not defined at all points of the space does not lead to any contradiction 
with experiment, as was repeatedly pointed out by N. M. Gjunter, whose 
investigations were chiefly instrumental in establishing a new point of 
view for the concept of the solution of an equation of mathematical 
physics. 

For example, if we wish to determine the flow of liquid in a channel, 
then in the classical presentation we must compute the velocity vector 
and the pressure at every point of the flow. But in practice we are never 
dealing with the pressure at a point but rather with the pressure on a certain 
area and never with the velocity vector at a given point but rather with 
the amount of the liquid passing through some area in a unit of time. 
The definition of generalized solution thus proposes essentially the 
computation of just those quantities that have direct physical meaning. 

In order that a larger number of problems may be solvable, we must 
seek the solutions among functions belonging to the widest possible class 
of functions for which uniqueness theorems still hold. Frequently such a 
class is dictated by the physical nature of the problem. Thus, in quantum 
mechanics it is not the state function $\psi(x)$, defined as a solution of the 
Schrödinger equation, that has physical meaning but rather the integral 
$a_i = \int \psi(x)\,\psi_i(x)\,dx$, where the $\psi_i$ are certain functions for which 
$\int |\psi_i|^2\,dx < \infty$. Thus the solution $\psi$ is to be sought not among the twice 
continuously differentiable functions but among the ones with integrable 
square. In the problems of quantum electrodynamics, it is still an open 
question which classes of functions are the ones in which we ought to 
seek solutions for the equations considered in that theory. 

Progress in mathematical physics during the last thirty years has been 
closely connected with this new formulation of the problems and with 
the creation of the mathematical apparatus necessary for their solution. 
One of the central features of this apparatus is the so-called embedding 
theorem of S. L. Sobolev. 

Particularly convenient methods of finding generalized solutions in one 
or another of these classes of functions are: the method of finite differences, 
the direct methods in the calculus of variations (Ritz method and Trefftz 
method), Galerkin’s method, and functional-operator methods. These 
latter methods basically depend on a study of transformations generated 
by these problems. We have already spoken in §5 of the method of finite 
differences and of Galerkin’s method. Here we will explain the basic ideas 
of the direct methods of the calculus of variations. 

Let us consider the problem of defining the position of a uniformly 
stretched membrane with fixed boundary. From the principle of minimum 
potential energy in a state of stable equilibrium the function $u(x, y)$ must 
give the least value of the integral 

$$J(u) = \iint_D \left( u_x^2 + u_y^2 \right) dx\,dy$$

in comparison with all other continuously differentiable functions $v(x, y)$ 
satisfying the same condition on the boundary, $v|_S = \varphi$, as the function $u$ 
does. With some restrictions on $\varphi$ and on the boundary $S$ it can be shown 
that such a minimum exists and is attained by a harmonic function, so 
that the desired function $u$ is a solution of the Dirichlet problem 
$\Delta u = 0$, $u|_S = \varphi$. The converse is also true: The solution of the Dirichlet 
problem gives a minimum to the integral $J$ with respect to all $v$ satisfying 
the boundary condition. 

The proof of the existence of the function $u$, for which $J$ attains its 
minimum, and its computation to any desired degree of accuracy may be 
carried out, for example, in the following manner (Ritz method). We 
choose an infinite family of twice continuously differentiable functions 
$\{v_n(x, y)\}$, $n = 0, 1, 2, \ldots$, equal to zero on the boundary for $n > 0$ and equal 
to $\varphi$ for $n = 0$. We consider $J$ for functions of the form 

$$v = v_0 + \sum_{k=1}^{n} C_k v_k,$$

where $n$ is fixed and the $C_k$ are arbitrary numbers. Then $J(v)$ will be a 
polynomial of second degree in the $n$ independent variables $C_1, C_2, \ldots, C_n$. 
We determine the $C_k$ from the condition that this polynomial should 
assume its minimum. This leads to a system of $n$ linear algebraic equations 
in $n$ unknowns, the determinant of which is different from zero. Thus the 
numbers $C_k$ are uniquely defined. We denote the corresponding $v$ by 
$v^n(x, y)$. It can be shown that if the system $\{v_n\}$ satisfies a certain condition 
of “completeness” the functions $v^n$ will converge, as $n \to \infty$, to a function 
which will be the desired solution of the problem. 
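The Ritz procedure just described can be sketched in a few lines. The example below is only an illustration under assumptions that do not come from the text: the domain is the unit square, the boundary values are $\varphi = x^2 - y^2$ (so that the harmonic solution $u = x^2 - y^2$ is known and the error can be measured), $v_0$ is a deliberately non-harmonic function with these boundary values, and the $v_k$ for $k > 0$ are products of sines. Integrals are replaced by a midpoint rule, and the function name is hypothetical.

```python
import numpy as np

def ritz_dirichlet(m, N=200):
    """Ritz approximation v^n = v_0 + sum C_k v_k on the unit square, n = m*m.

    Returns the coefficients C and the largest deviation from the known
    harmonic solution x^2 - y^2 on the quadrature grid.
    """
    h = 1.0 / N
    s = (np.arange(N) + 0.5) * h                 # midpoint quadrature nodes
    X, Y = np.meshgrid(s, s, indexing="ij")

    # v_0 takes the boundary values phi = x^2 - y^2 but is not harmonic.
    v0 = X**2 - Y**2 + X * (1 - X) * Y * (1 - Y)
    v0x = 2 * X + (1 - 2 * X) * Y * (1 - Y)
    v0y = -2 * Y + X * (1 - X) * (1 - 2 * Y)

    # v_k = sin(i pi x) sin(j pi y): each vanishes on the boundary.
    idx = [(i, j) for i in range(1, m + 1) for j in range(1, m + 1)]
    V = np.array([np.sin(i * np.pi * X) * np.sin(j * np.pi * Y) for i, j in idx])
    Vx = np.array([i * np.pi * np.cos(i * np.pi * X) * np.sin(j * np.pi * Y) for i, j in idx])
    Vy = np.array([j * np.pi * np.sin(i * np.pi * X) * np.cos(j * np.pi * Y) for i, j in idx])

    # J(v_0 + sum C_k v_k) is a second-degree polynomial in the C_k;
    # setting its partial derivatives to zero gives the linear system A C = -b.
    A = (np.einsum("kxy,lxy->kl", Vx, Vx) + np.einsum("kxy,lxy->kl", Vy, Vy)) * h * h
    b = (np.einsum("kxy,xy->k", Vx, v0x) + np.einsum("kxy,xy->k", Vy, v0y)) * h * h
    C = np.linalg.solve(A, -b)

    vn = v0 + np.tensordot(C, V, axes=1)
    return C, float(np.max(np.abs(vn - (X**2 - Y**2))))

for m in (1, 3, 5):
    C, err = ritz_dirichlet(m)
    print(f"n = {m * m:2d}   max |v^n - u| = {err:.2e}")
```

With this choice of the $v_k$ the matrix of the system happens to be diagonal, but the code solves the general linear system, as in the description of the method. The printed errors shrink as more functions are admitted, which is the convergence the "completeness" condition is meant to guarantee.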

In conclusion, we note that in this chapter we have given a description 
of only the simplest linear problem of mechanics and have ignored many 
further questions, still far from completely worked out, which are 
connected with more general partial differential equations. 


VII 


CURVES 
AND SURFACES 


§1. Topics and Methods in the Theory of Curves and Surfaces 

In a school course, geometry involves only the simplest curves: straight 
lines, broken lines, and circumferences and arcs of circles; and as for 
surfaces, merely planes, surfaces of polyhedra, spheres, cones, and 
cylinders. In more extended courses other curves are considered, chiefly 
the conic sections: ellipses, parabolas, and hyperbolas. But the study of 
an arbitrary curve or surface is completely alien to elementary geometry. 
At first sight it is even unclear how any general properties could be 
selected for investigation when we are speaking of arbitrary curves 
and surfaces. Yet such an investigation is completely natural and 
necessary. 

In every kind of practical activity and experience of nature, we 
constantly encounter curves and surfaces of widely different forms. The path 
of a planet in space, of a ship at sea, or of a projectile in the air, the 
track of a chisel on metal, of a wheel on the road, of a pen on the tape 
of a recording device, the shape of a camshaft governing the valves of a 
motor, the contours of an artistic design, the form of a dangling rope, 
the shape of a spiral spring coiled for some specific purpose, such examples 
are endless. The surfaces of various objects, thin shells, cisterns, the 
framework of an airplane, casings, sheetlike materials, provide an endless 
diversity of surfaces. Methods for the processing of products, the optical 
properties of various objects, the streamlining of bodies, the rigidity or 
deformability of thin shells, these and many other features depend to a 
great extent on the geometric form of the surfaces of objects. 

Of course, the gouge left by a chisel on metal is not a mathematical 
curve. A cistern, even with thin walls, is not a mathematical surface. 
But to a first approximation, which is sufficient for the study of many 
questions, actual objects may be represented mathematically by curves 
and surfaces. 

In introducing the concept of a mathematical curve, we disregard all 
the reasons why we cannot decrease the thickness without limit. By 
means of this abstract concept, we succeed in representing those 
(completely concrete) properties of an object that are preserved when its 
thickness and breadth are decreased in comparison with its length. 

Similarly, if we disregard the limitations on our ability to decrease 
the thickness of a shell or to determine precisely the actual boundaries 
of a given object, we are led to the concept of a mathematical surface. 
We will not give a rigorous description of these well-known concepts 
but will only remark that the exact mathematical definitions are not 
simple and belong to topology. 

Finally, an important source of interest in various curves and surfaces 
has been the development of mathematical analysis. It is sufficient to 
remember, for example, that a curve is the geometric representation of 
a function, which is the most important concept of analysis. Moreover, 
everyone is familiar with graphs quite apart from any study of analysis. 

In elementary geometry as created by the ancient Greeks, there was 
nothing about arbitrary curves or surfaces, but even in elementary analytic 
geometry we are accustomed to say “every curve is represented by an 
equation” or “every equation in the two variables $x$ and $y$ represents 
a curve in the coordinate plane.” Similarly the coordinates of surfaces 
are given by the equations $z = f(x, y)$ or $F(x, y, z) = 0$, and in general 
the coordinate method, by establishing a close connection between 
elementary geometry and analysis, enables us to define many different 
curves and surfaces. 

But analytic geometry, being restricted to the methods of algebra and 
elementary geometry, goes no further than the investigation of certain 
specific types of figures. The study of arbitrary curves and surfaces 
represents a new branch of mathematics, known as differential 
geometry. 

It must be admitted at once that differential geometry imposes on its 
curves and surfaces certain conditions arising from the methods of 
analysis. However, this is not an essential limitation on the diversity of 
the allowable curves and surfaces, since in the great majority of cases 
they are capable of representing actual objects with the necessary degree 
of precision. The name “differential geometry” itself gives an indication 
of the methods of the theory; its basic tool is the differential calculus 
and it primarily investigates the “differential” properties of the curves 
and surfaces, i.e., their properties “at a point.”* Thus, the direction of 
a curve at a point is determined by its tangent at that point and the 
amount by which it twists is described by its curvature (the exact definition 
of this term will be given below). Differential geometry investigates the 
properties of small segments of curves and surfaces and only in its later 
developments does it proceed to the study of their properties “in the 
large,” i.e., in their entire extent. 

The development of differential geometry is inseparably connected 
with the development of analysis. The basic operations of analysis, 
namely differentiation and integration, have a direct geometric meaning. 
As was mentioned in Chapter II, differentiating a function $f(x)$ corresponds 
to drawing a tangent to the curve $y = f(x)$. The slope of the tangent line 
(i.e., the trigonometric tangent of the angle it makes with the axis $Ox$) is 
precisely the derivative $f'(x)$ of the function $f(x)$ at the corresponding 
point (figure 1), and the area “under the curve” $y = f(x)$ is precisely the 
integral $\int_a^b f(x)\,dx$ of this function, evaluated between the corresponding 
limits. Just as in analysis we investigate arbitrary functions, so in 
differential geometry we examine arbitrary curves and surfaces. In analysis, 
the first object of study is the general course of a curve on a plane, its rise 
and fall, its greater or smaller curvature, the direction of its convexity, its 
points of inflection, and so forth. The close connection between analysis and 
the curves is indicated by the name of the first textbook in analysis, by the 
French mathematician l’Hôpital in 1696: “Infinitesimal analysis applied to 
the study of curves.” 

Fig. 1. 

* The properties of curves and surfaces “at a point” are those properties that depend 
only on an arbitrarily small neighborhood of the point. Properties of this sort are 
defined in terms of the derivatives (at the given point) of the functions occurring in the 
equations of the curve or surface. It is for this reason that differential geometry imposes 
conditions guaranteeing that the differential calculus is applicable: it is required that 
the curve or surface be defined by functions with a sufficient number of derivatives. 

By the middle of the 18th century, the differential and integral calculus 
had been sufficiently developed by the immediate successors of Newton 
and Leibnitz that the way was open for more profound applications to 
geometry. Indeed, it is only from this moment that one may properly 
speak of a theory of curves and surfaces. For surfaces, and for curves 
in space, the analogous problems are immeasurably richer in content 
than for plane curves, so that with the passage of time these problems 
outgrew the framework of a simple application of analysis to geometry 
and led to the formation of an independent theory. During the second 
half of the 18th century, many mathematicians shared in building up 
the elements of this theory: Clairaut, Euler, Monge, and others, among 
whom Euler must be considered as the founder of the general theory 
of surfaces. The first comprehensive work on curves and surfaces was 
the book of Monge “Application of analysis to geometry,” published in 
1795.* From the investigations of these mathematicians, and, in particular, 
from the book of Monge, we can easily understand the upsurge of interest 
in differential geometry. This upsurge was due to the demands of 
mechanics, physics, and astronomy, i.e., in the final analysis to the needs 
of technology and industry, for which the available results of elementary 
geometry were completely insufficient. 

The classical work of Gauss (1777-1855) in the theory of surfaces is 
also related to practical questions. His “General investigations concerning 
curved surfaces,” published in 1827, is basic for the differential geometry 
of surfaces as an independent branch of mathematics. His general methods 
and problems, discussed later in §4, originated to a great degree in the 
practical needs of map making. The problem of cartography consists of 
finding as exact a representation as possible of parts of the surface of 
the earth on a plane. A completely exact representation here is impossible, 
the mutual relations of various lengths being necessarily distorted because 
of the curvature of the earth. Thus one has the problem of finding the 
most nearly exact methods possible. The drawing of maps goes back to 
remote antiquity, but the creation of a general theory is an achievement 
of recent times and would not have been possible without the general 
theory of surfaces and the general methods of mathematical analysis. 
We note that one of the difficult mathematical problems of cartography 
was investigated by P. L. Cebysev (1821-1894), who obtained important 
results relating to nets of curved lines on surfaces. His investigations also 
arose from purely practical problems. 

The general questions of deforming one surface so that it can be mapped 
on another still constitute one of the main branches of geometry. Important 
results in this direction were obtained in 1838 by F. Minding (1806-1885), 
professor at the University of Dorpat (now Tartu). 


* Gaspard Monge (1746-1818) was not only an outstanding scientist but also an 
active French revolutionary (minister of naval affairs, and then director of the 
manufacture of cannon and powder). He followed the path, characteristic of the French 
bourgeois of the time, from Jacobin to adherent of the emperor Napoleon. 

By the second half of the last century, the theory of curves and surfaces 
was already well established in its basic features, provided we are speaking 
of “classical differential geometry” in contrast with the newer directions 
discussed later in §5. The basic equations in the theory of curves, namely 
the so-called Frenet formulas, had already been obtained, and in 1853 
K. M. Peterson (1828-1881), a student of Minding’s at Tartu University, 
discovered and investigated in his dissertation the basic equations of the 
theory of surfaces, rediscovered 15 years later and published by the 
Italian mathematician Codazzi, with whose name these equations are 
usually associated. Peterson, after graduating from the university at 
Tartu, lived and worked in Moscow, as a teacher in a gymnasium. 
Though he never held any academic position corresponding to his 
outstanding scientific achievements, he was nevertheless one of the 
founders of the Moscow Mathematical Society and of the journal 
“Matematičeskiĭ Sbornik,” published in Moscow from 1866 up to the 
present day. The Moscow school of differential geometry begins with 
Peterson. 

The results to date of the “classical” differential geometry were 
summarized by the French geometer Darboux in his four-volume “Lectures 
on the general theory of surfaces,” issued from 1887 to 1896. In the 
present century classical differential geometry continues to be studied, 
but the center of interest in curves and surfaces has largely shifted to 
new directions in which the class of figures under study has been even 
more widely extended. 


§2. The Theory of Curves 

Various methods of defining curves in differential geometry. From 
analysis and analytic geometry we are accustomed to the idea of defining 
curves by means of equations. In a rectangular coordinate system on the 
plane, a curve may be given either by the equation 

$$y = f(x),$$

or by the more general equation 

$$F(x, y) = 0.$$

However, this method of definition is suitable only for a plane curve, 
i.e., a line in the plane. We also require a method of writing equations 
of space curves not lying in any plane. An example of such a curve 
may be seen in the helix (figure 2). 

For the purposes of differential geometry, and for many other questions 
as well, it is most convenient to represent a curve as the trace of a 
continuous motion of a point. Of course, the given curve may have originated 
in some entirely different way, but we can always think of it as the path of 
a point moving along it. 

Let us assume that we have a fixed Cartesian coordinate system in space. If 
a moving point $X$ traces out a curve from time $t = a$ to $t = b$, then the 
coordinates of this moving point are given by the functions of the time 
$x(t)$, $y(t)$, and $z(t)$; the flight of an airplane or a projectile are examples. 
Conversely, if we are initially given the functions $x(t)$, $y(t)$, and $z(t)$, we 
may let them define the coordinates of a moving point $X$, which traces 
out some curve. Consequently, curves in space may be given by three 
equations of the form 

$$x = x(t), \qquad y = y(t), \qquad z = z(t).$$

In the same way a plane curve is defined by two equations 

$$x = x(t), \qquad y = y(t).$$

This is the most general manner of defining curves. 

As an example we consider the helix. It is produced by the spiral 
motion of a point that revolves uniformly around a straight line, the axis 
of the helix, and at the same time moves uniformly in a direction parallel 
to this axis. Let us take the axis of the helix as the axis $Oz$ and suppose 
that at time $t = 0$ the point lies on the axis $Ox$. We now wish to find 
how its coordinates depend on the time. If the motion parallel to the 
axis $Oz$ has velocity $c$, then obviously the distance travelled in this direction 
at time $t$ will be 

$$z = ct.$$

Also, if $\varphi$ is the angle of rotation around the axis $Oz$ and $a$ is the distance 
from the point to this axis, then, as can be seen in figure 2, 

$$x = a \cos \varphi, \qquad y = a \sin \varphi.$$

Since the rotation is uniform, the angle $\varphi$ is proportional to time; that is, 
$\varphi = \omega t$, where $\omega$ is the angular velocity of the rotation. In this manner 
we get 

$$x = a \cos \omega t, \qquad y = a \sin \omega t, \qquad z = ct.$$

So these are the equations of the helix, which as $t$ changes will be traced 
out by the moving point. 
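The parametric description lends itself directly to computation. The short sketch below evaluates the helix $x = a\cos\omega t$, $y = a\sin\omega t$, $z = ct$ at a few instants; the numerical values of $a$, $\omega$, and $c$ and the function name are arbitrary choices made for illustration.

```python
import math

def helix_point(t, a=1.0, omega=2.0, c=0.5):
    """Point of the helix x = a cos(omega t), y = a sin(omega t), z = c t."""
    return (a * math.cos(omega * t), a * math.sin(omega * t), c * t)

# One full revolution takes the time 2*pi/omega; sample it at quarter turns.
a, omega, c = 1.0, 2.0, 0.5
period = 2.0 * math.pi / omega
for k in range(5):
    t = k * period / 4
    x, y, z = helix_point(t, a, omega, c)
    # The distance from the axis Oz stays equal to a, while z grows in proportion to t.
    print(f"t={t:.3f}  x={x:+.3f}  y={y:+.3f}  z={z:.3f}  r={math.hypot(x, y):.3f}")
```

After each full revolution the point returns to the same $x$ and $y$, and $z$ has increased by $2\pi c/\omega$, the distance between successive turns.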

Of course the variable $t$ or, as it is usually called, the parameter, need 
not be thought of as representing the time. Also, the given parameter $t$ 
may be replaced by another; for example we may introduce a parameter $u$ 
by the formula $t = u^3$, or, in general, by $t = f(u)$.* In geometry the most 
natural choice of parameter is the length $s$ of the arc of the curve measured 
from some fixed point $A$ on it. Every possible value of the length $s$ 
represents a corresponding arc $AX$. Thus the position of $X$ is fully 
determined by the value of $s$ and the coordinates of the point $X$ are given 
by the functions of arc length $s$ 

$$x = x(s), \qquad y = y(s), \qquad z = z(s).$$

All these ways of defining curves, as well as other possible ones,† open 
up the possibility of numerical computation. Only when curves have 
been defined by equations can their properties be investigated by 
mathematical analysis. 

In the differential geometry of plane curves, there are three basic 
concepts: length, tangent, and curvature. For space curves, there are in 
addition the osculating plane and the torsion. We now proceed to explain 
the meaning and significance of these concepts. 

* Here, strictly speaking, it is necessary that the function $f$ be monotone. 

† A curve in space may also be given as the intersection of two surfaces, defined 
by the equations $F(x, y, z) = 0$, $G(x, y, z) = 0$, i.e., the curve is given by this pair of 
equations. In theoretical discussions a curve is most frequently given by a variable 
vector, i.e., the position of the point $X$ of the curve is defined by the vector 
$\mathbf{r} = \overrightarrow{OX}$, extending from the origin to this point. As the vector 
$\mathbf{r}$ changes, its end point $X$ moves along the given curve (figure 3). 

Length. Everyone has in mind a natural idea of what is meant by 
length, but this idea must be converted into an exact definition of the 
length of a mathematical curve, a definition with a specific numerical 
character, which will enable us to compute the length of a curve with 
any desired degree of accuracy and consequently to argue about lengths 
in a rigorous way. The same remarks apply to all mathematical concepts. 
The transition from informal ideas to exact measurements and definitions 
represents the transition from a prescientific understanding of objects to 
a scientific theory. The need for a precise definition of length arose in 
the final analysis from the requirements of technology and the natural 
sciences, whose development demanded investigation of the properties of 
lengths, areas, and other geometric entities. 



Fig. 3. Fig. 4. 


A simple and most useful definition of length is the following: The 
length of a curve is the limit of the length of broken lines inscribed in 
the curve under the condition that their vertices cluster closer and closer 
together on the curve. 

This définition arises naturally from our everyday methods of measuring. 
On the curve we take a sequence of points A 0 , A ,, A t , ■ ■ • (figure 4) 
and measure the distances between them. The sum of these distances 
(which is the length of the broken fine) expresses approximately the 
length of the curve. In order to define the length more exactly, it is natural 
to take the points A doser together, so that the broken fine follows 
the twists of the curve more closely. Finally, the exact value of the length 
is defined as the limit of these approximations as the points A are chosen 
arbitrarily close together.* Thus the earlier définition of length is a 
generalization, based on taking finer and finer steps, of a completely 
pradical manner of measuring length. 
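
The limiting process can be watched numerically. The following is a minimal sketch, assuming Python and taking the upper unit semicircle as a test curve (a choice made here for illustration, not one made in the text). It computes the lengths of inscribed broken lines with more and more vertices; the function names are hypothetical.

```python
import math

def polygon_length(points):
    """Sum of the distances between consecutive vertices of a broken line."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def semicircle_vertices(n):
    """n + 1 equally spaced points A_0, ..., A_n on the upper unit semicircle."""
    return [(math.cos(math.pi * i / n), math.sin(math.pi * i / n))
            for i in range(n + 1)]

# Lengths of inscribed broken lines as the vertices cluster closer together.
vertex_counts = (2, 8, 32, 128, 512)
lengths = [polygon_length(semicircle_vertices(n)) for n in vertex_counts]

for n, s in zip(vertex_counts, lengths):
    print(n, s)
```

The printed values increase and approach π, the length of the semicircle, from below, as the definition leads one to expect.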

From this definition of length, it is easy to derive a formula for
computing lengths when the curve is given analytically. We note, however,
that mathematical formulas are useful for more than just computation.
They are a brief statement of theorems that establish connections between
different mathematical entities. The theoretical significance of such
connections may far exceed the computational value of the formula. For
example, the importance of the Pythagorean theorem, expressed by the
formula

$$c^2 = a^2 + b^2,$$

is not confined to the computation of the square of the hypotenuse c
but lies chiefly in the fact that it expresses a relation among the sides
of a right triangle.

* The existence of the indicated limit, i.e., the length of the curve, is not initially
clear, even for curves lying in a bounded domain. If the curve is very twisted, its length
may be very great, and it is possible mathematically to construct a plane curve which
is so “twisted” that none of its arcs has a finite length since the lengths of broken lines
inscribed in it increase beyond all bounds.

Let us now introduce a formula for the length of a plane curve, given
in Cartesian coordinates by the equation y = f(x), assuming that the
function f(x) has a first derivative.

We inscribe a broken line in the curve (figure 5). Let $A_n$, $A_{n+1}$ be two
of its adjacent vertices with coordinates $x_n, y_n$ and $x_{n+1}, y_{n+1}$. The line
segment $A_n A_{n+1}$ is the hypotenuse of a right triangle the legs of which
are equal to

$$\Delta x_n = |x_{n+1} - x_n|, \qquad \Delta y_n = |y_{n+1} - y_n|.$$

Thus, by the Pythagorean theorem,

$$A_n A_{n+1} = \sqrt{(\Delta x_n)^2 + (\Delta y_n)^2} = \sqrt{1 + \left(\frac{\Delta y_n}{\Delta x_n}\right)^2}\,\Delta x_n.$$

It is easy to see that if the straight line drawn through the points $A_n$
and $A_{n+1}$ is translated parallel to itself, then at the instant when the line
leaves the curve it will assume the position of a tangent to this curve
at some point B, i.e., on the arc of the curve $A_n A_{n+1}$, there is at least
one point at which the tangent has the same direction as the chord
$A_n A_{n+1}$. (This obvious conclusion can easily be given a rigorous proof.)

Thus we may replace the ratio $\Delta y_n / \Delta x_n$ by the slope of the tangent
at B, i.e., by the derivative $f'(\xi_n)$, where $\xi_n$ is the abscissa of the point B.
Now the length of one link of the broken line is expressed by

$$A_n A_{n+1} = \sqrt{1 + y'^2(\xi_n)}\,\Delta x_n.$$

Fig. 5.

The entire length of the broken line is the sum of the lengths of its pieces.
Denoting the addition by the symbol $\Sigma$, we have

$$s_n = \sum \sqrt{1 + y'^2(\xi_n)}\,\Delta x_n.$$

To obtain the length of the curve, we must pass to the limit under
the condition that the greatest of the values $\Delta x_n$ tends to zero,

$$s = \lim_{\Delta x_n \to 0} \sum \sqrt{1 + y'^2(\xi_n)}\,\Delta x_n.$$

But this limit is exactly the integral defined in Chapter II, namely the
integral of the function $\sqrt{1 + y'^2}$. Thus the length of a plane curve is
expressed by the formula

$$s = \int_a^b \sqrt{1 + y'^2}\,dx, \qquad (1)$$

where the limits of integration a and b are the values of x at the ends
of the arc of the curve.
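
Formula (1) can be tried on a concrete curve. The sketch below, in Python, is only an illustration under stated assumptions: the parabola y = x² on [0, 1] and the midpoint rule are choices made here, and the helper name `arc_length` is hypothetical. The integral is approximated by a sum of exactly the kind that appears before the passage to the limit, and the result is compared with the closed form of the same integral.

```python
import math

def arc_length(dy, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of sqrt(1 + y'(x)^2) over [a, b]."""
    h = (b - a) / n
    return h * sum(math.sqrt(1.0 + dy(a + (i + 0.5) * h) ** 2) for i in range(n))

# Parabola y = x^2 on [0, 1], so that y' = 2x.
approx = arc_length(lambda x: 2.0 * x, 0.0, 1.0)

# Closed form of the integral of sqrt(1 + 4x^2) from 0 to 1.
exact = (2.0 * math.sqrt(5.0) + math.asinh(2.0)) / 4.0

print(approx, exact)
```

The two numbers agree to many decimal places, which shows how the sum over the links of the broken line turns into the integral.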

The corresponding, but somewhat different, formula for the length of 
a space curve is derived in basically the same way. 

The actual computation of a length by means of these formulas is,
of course, not always simple. Thus the calculation of the circumference
of a circle from formula (1) is rather complicated. However, as we have
said, the interest of formulas is not confined to computation; in particular,
formula (1) is also important for investigating the general properties of
length, its relations with other concepts, and so forth. We will have an
opportunity to make use of formula (1) in Chapter VIII.

Tangent. The tangent to a plane curve was already considered in
Chapter II. Its meaning for a space curve is completely analogous. In
order to define the tangent at a point A, we choose a point X on the
curve, distinct from A, and consider the secant AX. Then we allow X to
approach A along the curve. If the secant AX converges to some limiting
position, then the straight line in this limiting position is called the tangent
at the point A.*

* The limiting position of the secant may not exist, as can be seen from the example
in figure 13, Chapter II. The curve represented by y = x sin(1/x) oscillates near zero
in such a way that the secant OA, as A approaches O, constantly oscillates between
the straight lines OM and OL.

If we distinguish between the initial point and the end point of the
curve and thereby establish an order in which the points of the curve
are traversed, then we may say which of the points A and X comes first
and which comes second. (For example, if a train travels from Moscow
to Vladivostok, then Omsk obviously precedes Irkutsk.) So we may
define a direction along the secant from the first point to the second.
The limit of such “directed secants” gives us a “directed tangent.” In
figure 6, the arrow shows the direction in which the point A is passed
through. For the motion of a point along the curve, the velocity at each
instant is directed along the tangent to the curve.




The tangent has an important geometric property: Near the point of
tangency the curve departs less, in a well-defined sense, from this straight
line than from any other. In other words, the distance from the points
of the curve to the tangent is very small in comparison with their distance
to the point of tangency. More precisely, the ratio XX'/AX (figure 7)
tends to zero as X approaches A.* So a small segment of the curve may
be replaced by a corresponding segment of the tangent with an error
that is small in comparison with the length of the segment. This procedure
often allows us to simplify proofs, since in a passage to the limit it gives
completely exact results.
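
A small numerical experiment makes this property concrete. The sketch below assumes Python and uses the parabola y = x² with A at the origin, where the tangent is the axis Ox; the second line y = 0.1x and the function name are arbitrary choices made here for comparison.

```python
import math

def deviation_ratio(x, slope):
    """Ratio XX'/AX for X = (x, x^2), A = (0, 0) and the line y = slope * x through A."""
    dist_to_line = abs(x * x - slope * x) / math.sqrt(1.0 + slope * slope)
    dist_to_a = math.hypot(x, x * x)
    return dist_to_line / dist_to_a

xs = [10.0 ** (-k) for k in range(1, 6)]                 # X approaches A
tangent_ratios = [deviation_ratio(x, 0.0) for x in xs]   # the tangent y = 0
other_ratios = [deviation_ratio(x, 0.1) for x in xs]     # another line through A

for x, t, o in zip(xs, tangent_ratios, other_ratios):
    print(x, t, o)
```

For the tangent the ratio shrinks in proportion to x, while for the other line it settles near a nonzero constant: only the tangent is close to the curve in the sense described above.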

It is interesting to observe that for a curve which is not a straight line,
i.e., does not have a direction in the elementary sense, we have been
able, by associating it with a straight line, to define its direction at each
point. Thus the concept of direction has been extended; it has been given
a meaning which it did not previously have. This new concept of direction
reflects the actual nature of motion along a curve; at each instant the
point is moving in some definite direction, which changes continuously.


* This result follows immediately from the definition of the tangent itself. Evidently,
as is shown in figure 7, $XX'/AX = \sin\alpha$, where $\alpha$ is the angle between the tangent and
the secant AX. Thus, as $\alpha \to 0$, XX'/AX also tends to zero.

Curvature. To be able to judge by eye whether a path, a thin rod,
or a line in a drawing is more or less curved it is not necessary to
be a mathematician. But for even the simplest problems of mechanics,
a casual glance is not sufficient; we need an exact quantitative
description of the curvature. This is obtained by giving precise
expression to our intuitive impression of the curvature as the rapidity
of change of direction of the curve.

Let A be a point on the curve and M a point near A (figure 8). The
angle between the tangents at these points expresses how much the curve
has changed direction in the segment from A to M. Let us denote this
angle by $\varphi$. The average rate of change of direction (more precisely, the
average change per unit length of path along the segment AM of length $\Delta s$)
will obviously be $\varphi/\Delta s$. Then the curvature, namely the rate of change
of direction of the curve at the point A itself, is naturally defined as the
limit of the ratio $\varphi/\Delta s$ as $M \to A$; in other words, as $\Delta s \to 0$. Thus the
curvature is defined by the formula

$$k = \lim_{\Delta s \to 0} \frac{\varphi}{\Delta s}.$$


As a particular example, let us consider the curvature of the
circumference of a circle (figure 9). Obviously, the angle $\varphi$ between
the radii OA and OM is equal to the angle $\varphi$ between the tangents at
the points A and M, since the tangents are perpendicular to the radii.
The arc AM, subtending the angle $\varphi$, has length $\Delta s = \varphi r$, so that

$$\frac{\varphi}{\Delta s} = \frac{1}{r}.$$

This means that the ratio $\varphi/\Delta s$ is constant, so that the curvature of
the circumference of a circle, as the limiting value of this ratio, is
equal at all points to the reciprocal of the radius.*

Let us derive the formula for the curvature of a plane curve given by
the equation y = f(x). As the initial point for arc length we take a fixed
point N (figure 10). The angle $\varphi$ between the tangents at the points A
and M is obviously equal to the difference in the angles of inclination $\alpha$
of the tangents at A and M:

$$\varphi = |\Delta\alpha|.$$


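Since the angle $\alpha$ may decrease, we take the absolute value $|\Delta\alpha|$.
We are interested in the value

$$k = \lim_{\Delta s \to 0} \frac{\varphi}{\Delta s} = \lim_{\Delta s \to 0} \frac{|\Delta\alpha|}{\Delta s} = \lim_{\Delta x \to 0} \frac{\left|\dfrac{\Delta\alpha}{\Delta x}\right|}{\dfrac{\Delta s}{\Delta x}} = \frac{|\alpha'|}{s'}.$$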

The length of the arc of the curve NA is expressed by the integral

$$s = \int_a^x \sqrt{1 + y'^2}\,dx,$$

so that

$$s' = \sqrt{1 + y'^2}.$$


* We note that in general the concept of the curvature of a curve at a point may be
defined by comparing the curve with the circumference of a certain circle, which plays
the role of a model or standard for the curvature. For in fact, the curvature of the
given curve proves to be equal to the reciprocal of the radius of the (unique) circle
which fits the curve most closely in the neighborhood of the point.

It remains to find $\alpha'$. We know that $\tan\alpha = y'$; thus $\alpha = \arctan y'$.
Differentiating this last equation with respect to x, we get

$$\alpha' = \frac{y''}{1 + y'^2}.$$

Thus, finally,

$$k = \frac{|\alpha'|}{s'} = \frac{|y''|}{(1 + y'^2)^{3/2}}.$$


The corresponding formulas for other methods of representing plane and
space curves are given in the usual courses in analysis or differential
geometry.
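
As a check on the curvature formula k = |y''| / (1 + y'²)^(3/2) for a curve y = f(x), the following Python sketch applies it to two curves chosen here for illustration: the upper semicircle of radius r, where the result should be the reciprocal of the radius found above for the circle, and the parabola y = x². The derivatives are written out by hand, and all function names are hypothetical.

```python
import math

def curvature(dy, d2y, x):
    """k = |y''| / (1 + y'^2)^(3/2) at the abscissa x."""
    return abs(d2y(x)) / (1.0 + dy(x) ** 2) ** 1.5

r = 2.0

def circle_y(x):
    """Upper semicircle y = sqrt(r^2 - x^2)."""
    return math.sqrt(r * r - x * x)

def circle_dy(x):
    return -x / circle_y(x)               # y' = -x / y

def circle_d2y(x):
    return -r * r / circle_y(x) ** 3      # y'' = -r^2 / y^3

circle_k = [curvature(circle_dy, circle_d2y, x) for x in (-1.5, -0.5, 0.0, 1.0)]

def parabola_dy(x):
    return 2.0 * x                        # y = x^2

def parabola_d2y(x):
    return 2.0

parabola_k = [curvature(parabola_dy, parabola_d2y, x) for x in (0.0, 1.0, 3.0)]

print(circle_k)
print(parabola_k)
```

On the semicircle every value equals 1/r, whereas on the parabola the curvature is greatest at the vertex and falls off away from it.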

This formula allows us to give another geometric interpretation of
curvature, which is useful in many questions. Namely, the curvature of
a curve at a point is given by the formula

$$k = \lim_{l \to 0} \frac{2h}{l^2},$$

where h is the distance of a second point on the curve to the tangent
at the given point and l is the length of the segment of the tangent between
the point of tangency and the projection on the tangent of the other
point on the curve (figure 11).

To prove this we choose a rectangular coordinate system such that
the origin falls at the given point of the curve and the axis Ox is tangent
to the curve at this point (figure 11). (For simplicity we assume that the
curve is plane.) Then $y' = 0$ and $k = |y''|$. Expanding the function
y = f(x) by Taylor's formula, we get $y = \tfrac{1}{2} y'' x^2 + \varepsilon x^2$ (where we have
taken into account that $y' = 0$). Here $\varepsilon \to 0$ as $x \to 0$. Hence it follows
that $k = |y''| = \lim_{x \to 0} 2|y|/x^2$, and thus, since $|y| = h$, $x^2 = l^2$, we
have

$$k = \lim_{l \to 0} \frac{2h}{l^2}.$$


This formula shows that the curvature describes the rate at which the 
curve leaves the tangent. 
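
The limit of 2h/l² can also be observed numerically. The following Python sketch is an illustration under an assumption made here: the curve is a circle of radius r touching the axis Ox at the origin, so that the expected limit is the curvature 1/r. The function name is hypothetical.

```python
import math

def two_h_over_l_squared(l, r):
    """2h / l^2 for a circle of radius r that touches the tangent y = 0 at the origin."""
    # h is the height above the tangent of the point of the circle lying over x = l.
    h = r - math.sqrt(r * r - l * l)
    return 2.0 * h / (l * l)

r = 2.0
ratios = [two_h_over_l_squared(l, r) for l in (1.0, 0.1, 0.01, 0.001)]
print(ratios)
```

As l decreases the values approach 1/r = 0.5, in agreement with the formula.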

Let us now turn to some very important applications of curvature to 
problems of mechanics. 

First we consider the following problem. Let a flexible string be
stretched over a support (figure 12) in such a way that the string remains
in one plane. We wish to find the pressure of the string on the support
at every point, or to be more exact, to define the limit

$$p = \lim_{\Delta s \to 0} \frac{P}{\Delta s}, \qquad (2)$$

where P is the magnitude of the force $\mathbf{P}$ acting on the support along
a piece of length $\Delta s$ containing the given point. We assume for simplicity
that the magnitude T of the tension $\mathbf{T}$ is the same at all points of the
string.


Now consider the point A and a segment of the string AB.* On this
segment AB of length $\Delta s$, in addition to the reaction of the support,
only two external forces are acting, namely the tensions at the ends,
which are equal in magnitude and are directed along the tangents at the
ends of the segment. Thus the force $\mathbf{P}$ exerted by the string on the support
is equal to the geometric sum of the tensions at the ends. As can be
seen from figure 12, the vector $\mathbf{P}$ is the base AD of the isosceles triangle
CAD. The two equal sides of this triangle have length T and the angle
at the vertex C is equal to the angle $\varphi$ by which the tangent changes
direction in passing from A to B.

With decreasing $\Delta s$ the angle $\varphi$ decreases and the angle between $\mathbf{P}$
and the tangent at the point A approaches a right angle. Thus the pressure
is perpendicular to the tangent.

* It would be more natural to choose a segment with the point A in its interior;
this would not change the result but would make the computation somewhat more
complicated.

To find the magnitude of the pressure, we make use of the fact that
a small arc of the circumference has approximately the same length as
the chord subtending it. Thus we replace the length of the chord AD,
i.e., the magnitude P, by the length $T\varphi$ of the arc AD. Then by formula (2)
we get

$$p = \lim_{\Delta s \to 0} \frac{P}{\Delta s} = \lim_{\Delta s \to 0} \frac{T\varphi}{\Delta s} = T \lim_{\Delta s \to 0} \frac{\varphi}{\Delta s} = Tk.$$

Hence the pressure at each point is equal to the product of the curvature 
and the tension on the string and is exerted perpendicularly to the tangent 
at this point. 
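
The result p = Tk can be checked on the simplest support, a circular one. The Python sketch below is an illustration under assumptions made here: the string lies along a circle of radius r, so that k = 1/r, and the values T = 3 and r = 2 are arbitrary. It forms the geometric sum of the two end tensions on an arc of angle phi and divides its magnitude by the arc length; the function name is hypothetical.

```python
import math

def pressure_estimate(T, r, phi):
    """|P| / (arc length) for an arc of angle phi on a circle of radius r under tension T."""
    # Unit tangents at the ends A (angle 0) and B (angle phi), counterclockwise.
    t_a = (0.0, 1.0)
    t_b = (-math.sin(phi), math.cos(phi))
    # The tensions pull the segment outward along the tangents at its ends:
    # -T * t_a at A and +T * t_b at B. Their geometric sum is the force P.
    P = (T * (t_b[0] - t_a[0]), T * (t_b[1] - t_a[1]))
    return math.hypot(P[0], P[1]) / (r * phi)

T, r = 3.0, 2.0
estimates = [pressure_estimate(T, r, phi) for phi in (1.0, 0.1, 0.01, 0.001)]
print(estimates)
```

As the arc shrinks, the estimates approach T/r = 1.5, which is the product of the tension and the curvature.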

Consider a second problem. Let a mathematical point (i.e., a very
small body) move along a plane curve with a velocity of constant
magnitude v. What is its acceleration at a given point A? By definition, the
acceleration is equal to the limit of the ratio of the change in velocity
(during the time $\Delta t$) to the increment $\Delta t$ of the time. The velocity involves
not only magnitude but also direction, i.e., we consider the change in
the velocity vector. Therefore the mathematical problem of finding the
magnitude of the acceleration consists of finding the limit

$$w = \lim_{\Delta t \to 0} \frac{|\mathbf{v}(t + \Delta t) - \mathbf{v}(t)|}{\Delta t},$$

where $\mathbf{v}(t)$ is the velocity at the point A itself, and $|\mathbf{v}(t + \Delta t) - \mathbf{v}(t)|$ is
the length of the vector difference of the velocities. The limit which
concerns us may also be represented as

$$\lim_{\Delta s \to 0} \frac{|-\mathbf{v}(t) + \mathbf{v}(t + \Delta t)|}{\Delta s} \cdot \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t},$$

where $\Delta s$ is the length of the arc AB traversed during time $\Delta t$. Turning
to figure 13 and noting that the velocity at each point is directed along
the tangent while remaining constant in magnitude, we see geometrically
that finding the sum $-\mathbf{v}(t) + \mathbf{v}(t + \Delta t)$ is identical with finding the
vector $\mathbf{P}$ in the preceding problem. So we may avail ourselves of the
result there and, replacing tension by velocity, write

$$\lim_{\Delta s \to 0} \frac{|-\mathbf{v}(t) + \mathbf{v}(t + \Delta t)|}{\Delta s} = vk.$$

Fig. 13.

Moreover, $\lim_{\Delta t \to 0} \Delta s/\Delta t = v$. So we have the final result that the
acceleration of a body in uniform motion along the curve is equal to
the product of the curvature and the square of the velocity

$$w = kv^2 \qquad (3)$$

and is directed along the normal to the curve, i.e., along a straight line 
perpendicular to the tangent. 
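
Formula (3) can be tested by forming the difference quotient of the velocity directly. The Python sketch below is an illustration under assumptions made here: the path is a circle of radius r, so that k = 1/r, and the speed, radius, and instant t = 0.7 are arbitrary choices. The function names are hypothetical.

```python
import math

def velocity(t, speed, radius):
    """Velocity vector at time t for uniform motion with the given speed on a circle."""
    angle = speed * t / radius
    return (-speed * math.sin(angle), speed * math.cos(angle))

def acceleration_estimate(t, dt, speed, radius):
    """|v(t + dt) - v(t)| / dt, the ratio whose limit is the acceleration w."""
    v0 = velocity(t, speed, radius)
    v1 = velocity(t + dt, speed, radius)
    return math.hypot(v1[0] - v0[0], v1[1] - v0[1]) / dt

speed, radius = 3.0, 2.0
w_estimates = [acceleration_estimate(0.7, dt, speed, radius)
               for dt in (1e-1, 1e-2, 1e-3, 1e-4)]
print(w_estimates)
```

The estimates approach k v² = v²/r = 4.5 as the time increment decreases.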

Our recourse here to a geometric analogy, enabling us to use the
solution of the problem of the pressure exerted by a string in order to
solve a problem of the acceleration of a particle, shows once again how
useful it is to make an abstraction from the particular concrete properties
of a phenomenon to corresponding mathematical concepts and results;
for we can then make use of these results in the most varied situations.

We also note that the curvature, which from a mechanical point of
view reflects the change in the direction of motion, is seen to be closely
connected with the forces causing this change. The equation which
expresses this connection is easily derived if we multiply equation (3)
by the mass m of the moving point. We have

$$F_n = mw = mv^2 k.$$

Here $F_n$ is the magnitude of the normal component of the force acting
on the point.

Osculating plane. Although a space curve does not lie in one plane, 
still with each point A of the curve it is possible, as a rule, to associate 
a plane P which in the neighborhood of this point lies closer to the curve
than any other plane. This plane is called the osculating plane of the 
curve at the point. 

Naturally the osculating plane, as the plane closest to the given curve,
passes through the point A and contains the tangent T to the curve.
But there are many planes containing the point A and the straight line T.
In order to choose from among them the one plane that least deviates
from the curve, we investigate the deviation of the curve from the tangent.
For this purpose let us see how the curve runs along the tangent T;
in other words, let us project our curve onto the normal plane Q, which
is perpendicular to T at the point A (figure 14). The projection on the
plane Q of a segment of our curve containing A forms a new curve,
indicated in figure 14 by a dotted line. Usually it has a cusp at the
point A. If the curve so obtained has a tangent N at the point A, then
the plane P determined by T and N will naturally be closest to the
original curve in the neighborhood of the point A, i.e., it will be the
osculating plane at the point A. It may be shown that when the functions
defining the original curve have second derivatives and the curvature of
the curve at the point A is not zero, then the osculating plane necessarily
exists, and its equation may be expressed very simply in terms of the
first and second derivatives of the functions defining the curve.

We saw earlier that the properties of the tangent allow us to consider 
a small segment of a plane curve as though it were straight, thereby 
making an error which is small in comparison with the length of the 
segment; similarly the properties of the osculating plane allow us to 
consider a small segment of a space curve as though it were a plane 
curve, namely its projection on the osculating plane, and here the error 
will be small in comparison with the square of the length of the segment 
of the curve. 

There are many straight lines in space that are perpendicular to the
tangent; they form the normal plane at the given point of the curve.
Among these straight lines there is one, the line N, which lies in the
osculating plane. This line is called the principal normal to the curve.
Usually we also fix a direction for it, namely the direction of the
concavity of the projection of the curve on the osculating plane. The principal
normal plays the same role for a space curve as the ordinary (unique)
normal for a plane curve. In particular, if a thin string under tension T
is stretched in the form of a space curve over a support, then the pressure
of the string on the support has at each point the magnitude Tk and is
directed along the principal normal. If a material point is moving along
a space curve with a velocity of constant magnitude v, then its acceleration
is equal to $kv^2$ and is directed along the principal normal.

Torsion. From point to point along a curve the position of the
osculating plane will, in general, change. Just as the rate of change of
direction of the tangent characterized the curvature, so the rate of change
of direction of the osculating plane characterizes a new quantity, the
torsion of the curve. Here, as in the case of curvature, the rate is taken
with respect to arc length; that is, if $\varphi$ is the angle between the osculating
planes at a fixed point A and at a nearby point X, and if $\Delta s$ is the length
of the arc AX, then the torsion $\tau$ at the point A is defined as the limit*

$$\tau = \lim_{\Delta s \to 0} \frac{\varphi}{\Delta s}.$$

The sign of the torsion depends on the side of the curve toward which
the osculating plane turns as it moves along the curve.

We may imagine the osculating plane as the blade of a fan with the
two lines, the tangent and the principal normal, drawn on it. At each
moment the tangent is turning in the direction of the normal at a rate
determined by the curvature, while the osculating plane rotates around
the tangent with a speed and direction determined by the torsion.

The simplest results of the theory of differential equations may be used
to prove a fundamental theorem that states, roughly speaking, that two
curves with the same curvature and the same torsion are identical with
each other. Let us make this idea clearer. If we move along the curve
to various distances s from our initial point, we will arrive at points
where the curvature k and the torsion $\tau$ will have various values, depending
on s. Thus k(s) and $\tau(s)$ will be certain well-defined functions of the arc
length s.

The theorem in question states that if two curves have identical
curvature and torsion as functions of arc length, then the curves are
identical (i.e., one of them may be rigidly moved so as to coincide with
the other). In this manner curvature and torsion as functions of arc
length define a curve completely except for its position in space; they
describe all the properties of the curve by stating the relationship between
its length, its curvature, and its torsion. In this way the three concepts
constitute a sort of ultimate basis for questions concerning curves. With
their help we can also express the simplest concepts in the theory of
surfaces, to which we now turn.
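
A circular helix gives a simple example of a curve whose curvature and torsion are the same at every point. The Python sketch below is an illustration only. It assumes the parametrization r(t) = (a cos t, a sin t, bt) and uses the standard formulas k = |r' × r''| / |r'|³ and τ = (r' × r'') · r''' / |r' × r''|² from courses in differential geometry; these formulas are not derived in this section, and the function names are hypothetical.

```python
import math

def cross(u, v):
    """Vector product of two vectors in space."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two vectors in space."""
    return sum(p * q for p, q in zip(u, v))

def helix_curvature_torsion(a, b, t):
    """Curvature and torsion of r(t) = (a cos t, a sin t, b t) at the parameter t."""
    d1 = (-a * math.sin(t), a * math.cos(t), b)       # r'
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)    # r''
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)     # r'''
    c = cross(d1, d2)
    k = math.sqrt(dot(c, c)) / dot(d1, d1) ** 1.5
    tau = dot(c, d3) / dot(c, c)
    return k, tau

a, b = 3.0, 4.0
values = [helix_curvature_torsion(a, b, t) for t in (0.0, 1.0, 2.5)]
print(values)
```

At every parameter value the sketch returns k = a/(a² + b²) and τ = b/(a² + b²), here 0.12 and 0.16, so for the helix both k(s) and τ(s) are constant functions of arc length.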


* It may be shown that a helix has the same torsion at all its points and consequently
that we may define the torsion of a curve by comparing the curve with the (unique)
helix which best approximates the curve in the neighborhood of the given point. The
torsion also characterizes the way in which a given space curve differs from a plane
curve. With a certain analogy to curvature, it characterizes the rate at which the curve
leaves its osculating plane.

Of course, the theory of curves has not been exhausted by our present
remarks. There are many other concepts relating to curves: special types
of curves, families of curves, the position of curves on surfaces, questions
of the form of a curve as a whole, etc. These questions and the methods
of answering them are connected with almost every branch of mathematics.
The range of problems that may be solved by the theory of
curves is extremely rich and varied.


§3. Basic Concepts in the Theory of Surfaces 

The basic methods of defining a surface. If we wish to study surfaces
by means of analysis we must, of course, define them analytically. The
simplest way is by an equation

$$z = f(x, y),$$

in which x, y, and z are Cartesian coordinates of a point lying on the
surface. Here the function f(x, y) need not necessarily be defined for all
x, y; its domain may have various shapes. Thus, the surface illustrated
in figure 15 is given by the function f(x, y) defined inside an annulus.
Examples of surfaces given by equations of the form z = f(x, y) are also
familiar from analytic geometry. We know, for example, that the equation
z = Ax + By + C represents a plane, and $z = x^2 + y^2$ a paraboloid of
revolution (figure 16). For the application of differential calculus it is
necessary that the function f(x, y) have first, second, and sometimes even
higher derivatives. A surface given by such an equation is called regular.
Geometrically this means (though not quite precisely) that the surface
curves continuously without breaks or other singularities. Surfaces that
do not have this property, for example, those with cusps, breaks, or other
singularities, require a new kind of investigation (cf. §5).

Fig. 15. Fig. 16.

However, not every surface, even without singularities, can be entirely
represented by an equation of the form z = f(x, y). If every pair of
values of x, y in the domain of f(x, y) gives a completely determined z,
then every straight line parallel to the axis Oz must intersect the surface
at no more than one point (figure 17). Even such simple surfaces as
spheres or cylinders cannot be represented in the large by an equation
of the form z = f(x, y). In these cases the surface is defined in some
other manner, for example by an equation of the form F(x, y, z) = 0.
Thus a sphere of radius R with center at the origin has the equation

$$x^2 + y^2 + z^2 = R^2.$$

The equation $x^2 + y^2 = r^2$ gives a cylinder of radius r.
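
As a worked illustration (added here, and only a sketch of the point just made), solving the equation of the sphere for z shows why one equation z = f(x, y) cannot describe the whole sphere:

$$x^2 + y^2 + z^2 = R^2 \;\Longrightarrow\; z = +\sqrt{R^2 - x^2 - y^2} \quad\text{or}\quad z = -\sqrt{R^2 - x^2 - y^2}, \qquad x^2 + y^2 \le R^2.$$

Every straight line parallel to Oz through a point with $x^2 + y^2 < R^2$ meets the sphere twice, once for each sign, so each hemisphere separately has the form z = f(x, y) but the sphere as a whole does not.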

So when the investigation is concerned only with small segments of the
surface, as is usually the case in classical differential geometry, the
definition of a surface by an equation z = f(x, y) is perfectly general,
since every sufficiently small segment of a smooth surface can be
represented in this form. We take this way as basic, and leave other methods
of defining surfaces to be considered later in §§4 and 5.

Tangent plane. Just as at each point a smooth curve has a tangent
line which is close to the curve in a neighborhood of the point, so also
surfaces may have, at each of their points, a tangent plane.

The exact definition is as follows. A plane P, passing through a point M
on a surface F, is said to be tangent to the surface F at this point if
the angle $\alpha$ between the plane P and the secant MX, drawn from M to
a point X of the surface, converges to zero as the point X approaches
the point M (figure 18). All tangents to curves passing through the
point M and lying on the surface obviously lie in the tangent plane.

A surface F is called smooth if it has a tangent plane at each point 
and if, as we pass from point to point, the position of this plane varies 
continuously. 

Near the point of tangency, the surface departs very little from its
tangent plane: If the point X approaches the point M along the surface, 
then the distance of the point X from the tangent plane becomes smaller 
and smaller, even in comparison with its distance from the point M 
(the reader can easily verify this by considering how X approaches M 
in figure 18). In this way, the surface near the point M may be said to 
merge into the tangent plane. In the first approximation a small segment 
or, as it is called, an “element” of the surface may be replaced by a 
segment of the tangent plane. The perpendicular to the tangent plane 
which passes through the point of tangency acts as a perpendicular to 
the surface at this point and is called a normal. 

This possibility of replacing an element of the surface by a segment 
of the tangent plane is useful in many situations. For example, the 
reflection of light on a curved surface takes place in the same way as 
the reflection on a plane, i.e., the direction of the reflected ray is defined 
by the usual law of reflection: The incident ray and the reflected ray 
lie in one plane together with the normal to the surface and they make 
equal angles with this normal (figure 19), just as if the reflection were 
occurring in the tangent plane. Similarly for the refraction of light in 
a curved surface, each ray is refracted by an element of the surface with 
the usual law of refraction, just as if the element were plane. These facts 
are the basis for all calculations of reflection and refraction of light in
optical apparatus. Further, for example, solid bodies in contact with
each other have a common tangent plane at their point of contact. The
bodies are in contact over an element of their surface, and the pressure
of one body on the other, in the absence of friction, is directed along
the normal at the point of contact. This is also true when the bodies 
are tangent at more than one point, in which case the pressure is directed 
along the respective normals at each point of contact. 



The replacement of elements of a surface by segments of the tangent
planes can also serve as the basis of a definition of the area of various
surfaces. The surface is decomposed into small pieces $F_1, F_2, \ldots, F_n$ and
each piece is projected onto a plane tangent to the surface at some point
of this piece (figure 20). We thus obtain a number of plane regions
$P_1, P_2, \ldots, P_n$, the sum of whose areas gives an approximation to the
area of the surface. The area of the surface itself is defined as the limit
of the sums of the areas of the segments $P_1, P_2, \ldots, P_n$ under the
condition that the partitions of the surface become finer.* From this we can
derive an exact expression for the area in the form of a double integral.

Fig. 20.
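
The construction can be imitated numerically. The Python sketch below is an illustration under assumptions made here, not the derivation referred to in the text: the surface is the paraboloid z = x² + y² over the unit disk, the pieces are cut out by circles and rays in the xy-plane, and each piece is replaced by the part of the tangent plane lying over the same cell, whose area is the area of the cell multiplied by sqrt(1 + f_x² + f_y²). This is the integrand of the standard double-integral formula for area, which is not written out in this section. The function name is hypothetical.

```python
import math

def paraboloid_area(n_rho=400, n_theta=64):
    """Approximate area of z = x^2 + y^2 over the unit disk by pieces of tangent planes."""
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            x, y = rho * math.cos(theta), rho * math.sin(theta)
            fx, fy = 2.0 * x, 2.0 * y            # partial derivatives of f
            cell = rho * d_rho * d_theta         # area of the cell in the xy-plane
            # Area of the piece of the tangent plane lying over this cell.
            total += cell * math.sqrt(1.0 + fx * fx + fy * fy)
    return total

area = paraboloid_area()
exact = math.pi / 6.0 * (5.0 * math.sqrt(5.0) - 1.0)   # value of the double integral
print(area, exact)
```

The sum of the areas of the plane pieces agrees closely with the value of the double integral, and the agreement improves as the partition is made finer.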

These remarks clearly demonstrate the significance of the concept of
the tangent plane. However, in many questions the approximate
representation of an element of a surface by means of a plane is inadequate and
it is necessary to consider the curvature of the surface.

Curvature of curves on a surface. The curvature of a surface at a 
given point is characterized by the rate at which the surface leaves its 
tangent plane. But in different directions, the surface may leave its tangent 
plane at different rates. Thus the surface illustrated in figure 21 leaves 
the plane P in the direction OA at a faster rate than in the direction OB. 
So it is natural to define the curvature of a surface at a given point by 
means of the set of curvatures of all curves lying in the surface and
passing through the given point in different directions. 



Fig. 21. Fig. 22.


This is done as follows. We construct the tangent plane P through
the point M and choose a specific direction for the normal (figure 22).
Then we consider curves which are sections of the surface cut by planes
passing through the normal at the point M; these curves are called
normal sections. The curvature of a normal section is given a sign, which
is plus if the section is concave in the direction of the normal and minus
if it is concave in the opposite direction. Thus, in a surface which is
saddle-shaped, as illustrated in figure 23 with the arrow indicating the
direction of the normal to the surface, the curvature of the section MA
is positive and that of the section MB is negative.

* This is exactly the expression for the area which was used in §1, Chapter VIII.

A normal section is defined by the angle $\varphi$ by which its plane is rotated
from some initial ray in the tangent plane (figure 22). If we know the
curvature of the normal section $k(\varphi)$ in terms of the angle $\varphi$, we will
have a rather complete picture of the behavior of the surface in the
vicinity of the point M.

A surface may be curved in many different ways and thus it would
appear that the dependence of the curvature k on the angle $\varphi$ may be
arbitrary. In fact this is not so. For the surfaces studied in differential
geometry, there exists a simple law, due to Euler, that establishes the
connection between the curvatures of the normal sections passing through
a given point in various directions.

It is shown that at each point of a surface there exist two particular 
directions such that 

1. They are mutually perpendicular; 

2. The curvatures $k_1$ and $k_2$ of the normal sections in these directions
are the smallest and largest values of the curvatures of all normal sections;*

3. The curvature $k(\varphi)$ of the normal section rotated from the section
with curvature $k_1$ by the angle $\varphi$ is expressed by the formula

$$k(\varphi) = k_1 \cos^2 \varphi + k_2 \sin^2 \varphi. \qquad (4)$$

Such directions are called the principal directions and the curvatures
$k_1$ and $k_2$ are called the principal curvatures of the surface at the given
point.
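As a small numerical illustration of Euler's formula (4), the following sketch samples the normal curvature over all directions and checks that it stays between the two principal curvatures. The function name `normal_curvature` and the values chosen for the principal curvatures are assumptions made only for this example.

```python
import math

def normal_curvature(k1, k2, phi):
    """Euler's formula (4): curvature of the normal section rotated by phi
    from the first principal direction."""
    return k1 * math.cos(phi) ** 2 + k2 * math.sin(phi) ** 2

# Hypothetical principal curvatures, chosen only for illustration.
k1, k2 = 0.5, 2.0

# Sample the directions from 0 to 180 degrees in steps of one degree.
samples = [normal_curvature(k1, k2, math.radians(deg)) for deg in range(181)]

# The smallest and largest sampled values are the principal curvatures.
print(min(samples), max(samples))
```

The extreme values are attained at 0 and 90 degrees, that is, in two mutually perpendicular directions, as the theorem asserts.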

This theorem of Euler shows that in spite of the diversity of surfaces, 
their form in the neighborhood of each point must be one of a very 
few completely defined types, with an accuracy to within magnitudes of 
the second order of smallness in comparison with the distance from the 
given point. In fact, if $k_1$ and $k_2$ have the same sign, then the sign of
$k(\varphi)$ is constant and the surface near the point has the form illustrated
in figure 22. If $k_1$ and $k_2$ have opposite signs, for example $k_1 > 0$ and
$k_2 < 0$, then the curvature of the normal section obviously changes sign.
This is seen from the fact that for $\varphi = 0$ the curvature $k = k_1 > 0$ and
for $\varphi = \pi/2$ we have $k = k_2 < 0$.

From formula (4) for $k(\varphi)$, it is not difficult to prove that as $\varphi$ changes
from 0 to $\pi$ the sign of $k(\varphi)$ changes twice,* so that near the point the
surface has a saddle-shaped form (figure 23).

* In the particular case $k_1 = k_2$ the curvature of all sections is the same; as, for
example, on a sphere.
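The two sign changes can be located explicitly: setting $k_1 \cos^2 \varphi + k_2 \sin^2 \varphi = 0$ gives $\tan^2 \varphi = -k_1/k_2$. The sketch below computes the two angles for a hypothetical saddle point; the function names and the sample curvatures are assumptions for illustration.

```python
import math

def k(k1, k2, phi):
    """Normal curvature in direction phi by Euler's formula."""
    return k1 * math.cos(phi) ** 2 + k2 * math.sin(phi) ** 2

def zero_curvature_angles(k1, k2):
    """Angles in (0, pi) at which the normal curvature vanishes.
    Defined only when k1 and k2 have opposite signs."""
    if k1 * k2 >= 0:
        raise ValueError("principal curvatures must have opposite signs")
    phi0 = math.atan(math.sqrt(-k1 / k2))
    return phi0, math.pi - phi0

# Hypothetical saddle point with k1 > 0 and k2 < 0.
k1, k2 = 1.0, -3.0
a, b = zero_curvature_angles(k1, k2)

# Signs just before and after each zero: plus to minus, then minus to plus.
print(k(k1, k2, a - 0.1), k(k1, k2, a + 0.1))
print(k(k1, k2, b - 0.1), k(k1, k2, b + 0.1))
```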

When one of the numbers $k_1$ and $k_2$ is equal to zero, the curvature
always has the same sign, except for the one value of $\varphi$ for which it
vanishes. This occurs, for example, for every point on a cylinder (figure 24).



In the general case the surface near such a point has a form close to that
of a cylinder. 


Finally, for $k_1 = k_2 = 0$ all normal sections have zero curvature.
Near such a point the surface is especially “close” to its tangent plane.
Such points are called flat points. One example of such a point is given
in figure 25 (the point M). The properties of a surface near a flat point



Fig. 25. 


Fig. 26. 


* It is a simple matter to show that $k(\varphi) = k_1 \cos^2 \varphi + k_2 \sin^2 \varphi$ vanishes for
$\varphi = \arctan \sqrt{-k_1/k_2}$ and $\varphi = \pi - \arctan \sqrt{-k_1/k_2}$, changing sign the first time
from plus to minus and the second from minus to plus.
Let us now consider a section of the surface cut by an arbitrary plane Q
(figure 26) not passing through the normal. The curvature $k_L$ of such a
curve L, as Meusnier showed,* is connected by a simple relation with
the curvature $k_N$ of the normal section in the same direction, i.e., the
one that intersects the tangent plane in the same straight line. This
connection is expressed by the formula

$$k_L = \frac{|k_N|}{\cos \theta},$$

where $\theta$ is the angle between the normal and the plane Q. The correctness
of this formula may be visualized very conveniently on a sphere. 
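The check on a sphere can be carried out numerically. On a sphere of radius R every normal section is a great circle of curvature 1/R, while a plane through a tangent line at M that makes the angle $\theta$ with the normal passes at distance $R \sin \theta$ from the center and so cuts out a circle of radius $R \cos \theta$. The sketch below compares the two curvatures; the function name and the sample values of R and $\theta$ are assumptions.

```python
import math

def section_curvature_on_sphere(R, theta):
    """Curvature of the circle cut from a sphere of radius R by a plane that
    contains a tangent line at M and makes the angle theta with the normal."""
    dist_center_to_plane = R * math.sin(theta)
    circle_radius = math.sqrt(R**2 - dist_center_to_plane**2)
    return 1.0 / circle_radius

# Hypothetical sample values.
R, theta = 2.0, math.radians(40)

k_N = 1.0 / R                                  # great circle
k_L = section_curvature_on_sphere(R, theta)    # inclined section

# Meusnier's relation predicts k_L = |k_N| / cos(theta).
print(k_L, abs(k_N) / math.cos(theta))
```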

Finally, the curvature of any curve lying in the surface and having 
the plane Q as its osculating plane may be shown to be identical with 
the curvature of the intersection of Q with the surface. 

Thus, if we know $k_1$ and $k_2$, the curvature of any curve in the surface
is defined by the direction of its tangent and the angle between its osculating
plane and the normal to the surface. Consequently, the character
of the curvature of a surface at a given point is defined by the two numbers
$k_1$ and $k_2$. Their absolute values are equal to the curvatures of two
mutually perpendicular normal sections, and their signs show the direction 
of the concavity of the respective normal sections with respect to a chosen 
direction on the normal. 

Let us now prove the theorems of Euler and Meusnier mentioned earlier. 

1. For the proof of Euler’s theorem we need the following lemma.
If the function $f(x, y)$ has continuous second derivatives at a given point,
then the coordinate axes may be rotated through an angle $\alpha$ such that
in the new coordinate system the mixed derivative $f_{x'y'}$ will be equal to
zero at this point.† We recall that after rotation of axes the new variables
$x'$, $y'$ are connected with $x$ and $y$ by the formulas

$$x = x' \cos \alpha - y' \sin \alpha; \qquad y = x' \sin \alpha + y' \cos \alpha$$

(cf. Chapter III, §7). For the proof of the lemma we note that

$$\frac{\partial x}{\partial x'} = \cos \alpha, \quad \frac{\partial y}{\partial x'} = \sin \alpha, \quad \frac{\partial x}{\partial y'} = -\sin \alpha, \quad \frac{\partial y}{\partial y'} = \cos \alpha.$$


* Meusnier (1754-1793) was a French mathematician, a student of Monge; he was 
a general in the revolutionary army and died of wounds received in battle. 

† We will denote partial derivatives by subscripts; for example, in place of $\partial f/\partial x$
we write $f_x$, in place of $\partial^2 f/\partial y^2$ we write $f_{yy}$, etc.
Computing the derivative $f_{x'y'}$ by the chain rule, we arrive after some
calculation at the result

$$f_{x'y'} = f_{xy} \cos 2\alpha + \tfrac{1}{2}(f_{yy} - f_{xx}) \sin 2\alpha,$$

from which it readily follows that for

$$\cot 2\alpha = \frac{f_{xx} - f_{yy}}{2 f_{xy}}$$

we will have

$$f_{x'y'} = 0.$$
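The lemma is easy to test numerically. The sketch below reads the rotation formula as $f_{x'y'} = f_{xy} \cos 2\alpha + \tfrac{1}{2}(f_{yy} - f_{xx}) \sin 2\alpha$, picks an angle with $\cot 2\alpha = (f_{xx} - f_{yy})/(2 f_{xy})$, and confirms that the mixed derivative vanishes. The function names and the sample second derivatives are hypothetical.

```python
import math

def mixed_derivative_after_rotation(fxx, fxy, fyy, alpha):
    """Mixed second derivative in axes rotated through the angle alpha."""
    return fxy * math.cos(2 * alpha) + 0.5 * (fyy - fxx) * math.sin(2 * alpha)

def diagonalizing_angle(fxx, fxy, fyy):
    """An angle alpha with cot(2*alpha) = (fxx - fyy) / (2*fxy).
    If fxy is already zero, no rotation is needed."""
    if fxy == 0:
        return 0.0
    # atan2 also covers the case fxx == fyy, where 2*alpha = pi/2.
    return 0.5 * math.atan2(2 * fxy, fxx - fyy)

# Hypothetical values of the second derivatives at the point.
fxx, fxy, fyy = 3.0, 1.0, -2.0
alpha = diagonalizing_angle(fxx, fxy, fyy)
print(alpha, mixed_derivative_after_rotation(fxx, fxy, fyy, alpha))
```

Using `atan2` instead of dividing by $f_{xy}$ is a design choice of this sketch: it avoids a division by zero and handles the case $f_{xx} = f_{yy}$ without a special branch.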


We now consider the surface F, given by the equation $z = f(x, y)$, in
which the origin is at the point M under consideration and the axes Ox
and Oy are so chosen in the tangent plane that $f_{xy}(0, 0) = 0$. In the
tangent plane P we take an arbitrary straight line making an angle $\varphi$ with the
axis Ox and consider the normal section L in the direction of this straight
line (figure 27). From the formula derived in §2, the curvature of L at
the point M, taking its sign into account, is equal to

$$k_L = \lim_{\xi \to 0} \frac{2 f(x, y)}{\xi^2}.$$

Here $f(x, y)$ is the distance (again taking its sign into account) of a point
on L to the chosen straight line. Expanding $f(x, y)$ by Taylor’s formula
(Chapter II, §9) and noting that $f_x(0, 0) = f_y(0, 0) = 0$ (since the axes
Ox and Oy lie in the tangent plane) we get

$$f(x, y) = \tfrac{1}{2}(f_{xx} x^2 + f_{yy} y^2) + \epsilon (x^2 + y^2),$$

where $\epsilon \to 0$ as $x \to 0$, $y \to 0$. For a point on L, we have $x = \xi \cos \varphi$,
$y = \xi \sin \varphi$, $\xi^2 = x^2 + y^2$ (figure 27), and thus

$$k_L = \lim_{\xi \to 0} \frac{f_{xx} \xi^2 \cos^2 \varphi + f_{yy} \xi^2 \sin^2 \varphi + 2\epsilon \xi^2}{\xi^2} = f_{xx} \cos^2 \varphi + f_{yy} \sin^2 \varphi.$$


Putting $\varphi = 0$, $\varphi = \pi/2$, we find that $f_{xx}$ and $f_{yy}$ are the curvatures $k_1$ and
$k_2$ of the normal sections in the direction of the axes Ox and Oy. Thus the
formula derived is actually Euler’s formula: $k = k_1 \cos^2 \varphi + k_2 \sin^2 \varphi$.
The fact that $k_1$ and $k_2$ are the maximal and minimal curvatures also
follows from this formula.
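The limit used in this proof can be watched numerically. The sketch below takes a hypothetical surface $z = f(x, y)$ whose first derivatives and mixed second derivative vanish at the origin, evaluates $2f/\xi^2$ at a point of the normal section close to M, and compares the value with Euler's formula. The particular function f and the chosen angle are assumptions for illustration only.

```python
import math

FXX, FYY = 1.0, -0.4   # hypothetical second derivatives at the origin

def f(x, y):
    """Sample surface: f_x = f_y = f_xy = 0 at the origin; the cubic terms
    play the role of the remainder in Taylor's formula."""
    return 0.5 * (FXX * x**2 + FYY * y**2) + x**3 - 2.0 * x * y**2

def section_curvature_estimate(phi, xi):
    """The ratio 2 f / xi^2 at the point of the normal section lying at
    distance xi from the origin in the direction phi."""
    x, y = xi * math.cos(phi), xi * math.sin(phi)
    return 2.0 * f(x, y) / xi**2

phi = math.radians(30)
euler = FXX * math.cos(phi) ** 2 + FYY * math.sin(phi) ** 2
estimate = section_curvature_estimate(phi, 1e-6)
print(estimate, euler)
```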



2. For the proof of Meusnier’s theorem we consider a normal section
$L_N$ and a section L whose plane forms an angle $\theta$ with the plane of the
section $L_N$, as in figure 28. The axes Ox and Oy lie in the tangent plane,
and we also take the axis Ox to be tangent to the curves $L_N$ and L at
the origin. The distance $h(x, y)$ to the Ox axis of a point A on L with
coordinates $x$, $y$, $f(x, y)$ is obviously equal to $h(x, y) = |f(x, y)|/\cos \theta$
(figure 28). Using Taylor’s formula, we express the curvature $k_L$ of the
curve L in the following manner:

$$k_L = \lim_{x \to 0} \frac{2h(x, y)}{x^2} = \lim_{x \to 0} \frac{2|f(x, y)|}{x^2 \cos \theta} = \lim_{x \to 0} \frac{|f_{xx} x^2 + 2 f_{xy} xy + f_{yy} y^2 + 2\epsilon (x^2 + y^2)|}{x^2 \cos \theta}, \qquad (5)$$
where $\epsilon \to 0$ as $x, y \to 0$. Since the axis Ox is tangent to the curve L,
obviously $\lim_{x \to 0} y/x = 0$. Thus, taking the limit in formula (5), we get

$$k_L = \frac{|f_{xx}|}{\cos \theta}.$$

But for the chosen coordinate system the curve $L_N$ has the equation
$z = f(x, 0)$, for which $|k_N| = |f_{xx}|$. Thus $k_L = |k_N|/\cos \theta$ and
Meusnier’s theorem is proved.

Mean curvature. In many questions of the theory of surfaces, the
most important role is played not by the principal curvatures themselves
but by certain quantities dependent on them, namely the mean curvature
and the Gaussian or total curvature of the surface at a given point. Let
us examine them in detail.

The mean curvature of a surface at a given point is the average of the
principal curvatures

$$K_{av} = \tfrac{1}{2}(k_1 + k_2).$$

As an example of the usefulness of this concept, we consider the
following mechanical problem. We assume that over the surface of some
body F there is stretched a taut elastic rubber film. We ask about the
pressure exerted by this film on each point of the surface of F.

The pressure at a point M is measured by the force exerted by the film
on a segment of the surface of unit area containing the point M; to be
more exact, the pressure “at the point” M is defined as the limit of the
ratio of this force to the area of the segment as the latter shrinks to the
point M.

We surround the point M on the surface with a small curvilinear
rectangle whose sides have lengths $\Delta s_1$ and $\Delta s_2$ and are perpendicular
to the first and second principal directions at M (figure 29).* On each
side of the rectangle there is exerted a force that is proportional (from
the assumed uniformity of the tension) to the length of the side and
the tension T acting on the film. Thus, on the sides perpendicular to


* Our reasoning here is not rigorous. However, by making estimates of the errors
introduced, it is possible to give a rigorous proof of the result.
the first principal direction, there are exerted forces that are approximately
equal to $T \Delta s_1$ and have the direction of the tangent to the surface.
Similarly, forces equal to $T \Delta s_2$ act on the other pair of sides of the
rectangle. In order to find the pressure at the point M, we must divide
the resultant of these four forces by the area of the rectangle (approximately
equal to $\Delta s_1 \Delta s_2$) and pass to the limit for $\Delta s_1, \Delta s_2 \to 0$. Let us
begin by dividing the resultant of the first two forces by $\Delta s_1 \Delta s_2$.

If we examine the rectangle from its side (figure 30), we see that these
forces are directed along tangents to the curve of the first normal section
and that the distance between their points of application is exactly $\Delta s_2$.
So we have the same problem here as in §2 for the pressure of a string
on a support. Using the earlier result, we find that the desired limit is
equal to $k_1 T$, where $k_1$ is the curvature of the first normal section. With
a similar expression for the other two forces, we obtain the formula:

$$p_M = T(k_1 + k_2) = 2T K_{av}.$$

Fig. 30.

This result has many important consequences. Let us consider an
example.

It is known that the surface film of a liquid is under a tension that
is the same in all directions on the surface. For a mass of liquid bounded
by a curved surface, this tension, by the previous result, exerts a pressure
on the surface which is proportional to its mean curvature at the given 
point. 
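To get a feeling for the magnitudes, the sketch below evaluates the pressure $p = T(k_1 + k_2) = 2T K_{av}$ for spherical drops, where both principal curvatures equal 1/R. The numerical value of the tension is an assumption: roughly 0.072 N/m is the commonly quoted surface tension of water near room temperature, and the radii are arbitrary illustrative choices.

```python
def film_pressure(tension, k1, k2):
    """Pressure exerted by a film under the given tension on a surface
    with principal curvatures k1 and k2: p = T (k1 + k2) = 2 T K_av."""
    mean_curvature = 0.5 * (k1 + k2)
    return 2.0 * tension * mean_curvature

# Assumed illustrative tension (N/m), approximately that of water.
T = 0.072

# Drop radii in metres; for a sphere k1 = k2 = 1/R.
for radius in (1e-3, 1e-6, 1e-9):
    p = film_pressure(T, 1.0 / radius, 1.0 / radius)
    print(f"R = {radius:g} m, p = {p:g} Pa")
```

The pressure grows in inverse proportion to the radius, which is the quantitative form of the remark that very small drops are under very large pressure.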

So in drops of very small diameter the pressures are very large, a fact 
that hinders the formation of such drops. In a cooling vapor the drops 
begin to form, as a rule, around specks of dust and around charged 
particles. In a completely pure, slightly cooled vapor, the formation of 
drops is delayed. But if, for example, a particle passes through the vapor 
at high speed, causing ionization of the molecules, then around the ions
formed in its path there will momentarily appear small drops of vapor,
constituting a visible track of the particle. This is the basis for construction
of the Wilson chamber, widely used in nuclear physics for
observing the motions of various charged particles. 
Since the pressure exerted by a liquid is the same in all directions,
a drop of liquid in the absence of other sources of pressure must assume
a form for which at all points of the surface the mean curvature is the
same. In the experiment of Plateau, we take two liquids of the same
specific weight, so that a clot of one of them will float in equilibrium
in the other. It may be assumed that the floating liquid is acted on only
by surface tension,* and it turns out that the “floating” liquid always
takes the form of a sphere. This result suggests that every closed surface
with constant mean curvature is a sphere, a theorem that is in fact true,
although the strict mathematical proof of it is very difficult.

It is possible to approach the question from still another side. In view 
of the fact that the surface tension tends to decrease the area of the 
surface, while the volume of the liquid cannot change, it is natural to
expect that the floating mass of liquid will have the smallest surface for
a given volume. It can be proved that a body with this property is a sphere. 

The relation between the lateral pressure of the film and its mean
curvature can also be used to determine the form of a soap film suspended
in a contour. Since the lateral pressure over the surface of the film, being
directed along the normal to the surface, is not opposed by any reaction
of the support (the support in this case is simply not there), it must be
equal to zero, so that for the desired surface we have the condition

$$K_{av} = 0. \qquad (6)$$

From the analytic expression for mean curvature, we obtain a differential
equation, and the problem consists of solving this equation under the
condition that the desired surface passes through the given contour.†
There have been many investigations of this difficult problem.

The same equation (6) arises from the problem of finding the surface
of least area bounded by a given contour. From a physical point of view, 
the identity of these two problems is a natural one, since the film tends 
to decrease its area and reaches a position of stable equilibrium only 
when it attains the minimal area possible under the given conditions. 
Surfaces of zero mean curvature, by reason of their connection with this
problem, are called minimal. 

The mathematical investigation of minimal surfaces is of great interest,
partly because of their wide variety of essentially different shapes, as
discovered by experiments with soap film. Figure 31 illustrates two soap
films suspended from different contours.

* The increase of pressure with depth may be ignored, since it is the same for both
liquids because of their having the same specific weight. So on their common boundary
the additional internal and external pressures caused by the depth are neutralized by
each other.

† For a surface given by the equation $z = z(x, y)$, equation (6) assumes the form
$(1 + z_y^2) z_{xx} - 2 z_x z_y z_{xy} + (1 + z_x^2) z_{yy} = 0$.
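For a surface $z = z(x, y)$ the condition of zero mean curvature is usually written as $(1 + z_y^2) z_{xx} - 2 z_x z_y z_{xy} + (1 + z_x^2) z_{yy} = 0$. The sketch below evaluates the left side by central finite differences and applies it to the helicoid $z = \arctan(y/x)$, a classical minimal surface, and to a paraboloid, which is not minimal. The function names, the step size h and the test point are assumptions of this example, and the finite differences give only an approximate residual.

```python
import math

def minimal_surface_residual(z, x, y, h=1e-3):
    """Left side of (1 + z_y^2) z_xx - 2 z_x z_y z_xy + (1 + z_x^2) z_yy,
    with all derivatives approximated by central differences of step h."""
    zx = (z(x + h, y) - z(x - h, y)) / (2 * h)
    zy = (z(x, y + h) - z(x, y - h)) / (2 * h)
    zxx = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    zyy = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    zxy = (z(x + h, y + h) - z(x + h, y - h)
           - z(x - h, y + h) + z(x - h, y - h)) / (4 * h**2)
    return (1 + zy**2) * zxx - 2 * zx * zy * zxy + (1 + zx**2) * zyy

def helicoid(x, y):
    # z = arctan(y / x), written with atan2; used here only for x > 0.
    return math.atan2(y, x)

def paraboloid(x, y):
    return x**2 + y**2

print(minimal_surface_residual(helicoid, 1.0, 0.5))     # close to zero
print(minimal_surface_residual(paraboloid, 1.0, 0.5))   # far from zero
```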



Fig. 31. 

Gaussian curvature. The Gaussian curvature of a surface at a given
point is the product of the principal curvatures

$$K = k_1 k_2.$$

The sign of the Gaussian curvature defines the character of the surface
near the point under consideration. For $K > 0$ the surface has the form
of a bowl ($k_1$ and $k_2$ have the same sign) and for $K < 0$, when $k_1$ and $k_2$
have different signs, the surface is like a saddle. The remaining cases,
discussed earlier, correspond to zero Gaussian curvature. The absolute
value of the Gaussian curvature gives the degree of curvature of the
surface in general, as a sort of abstraction from the various curvatures
in different directions. This becomes particularly clear if we consider a
different definition of Gaussian curvature, which does not depend on
investigating curves on the surface.
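The classification by the sign of $K = k_1 k_2$, together with the two cases of zero Gaussian curvature discussed earlier, can be summarized in a few lines. The function name, the returned labels and the tolerance `tol` are hypothetical choices made for this sketch.

```python
def classify_point(k1, k2, tol=1e-12):
    """Local shape of a surface near a point, judged from its principal
    curvatures through the Gaussian curvature K = k1 * k2."""
    K = k1 * k2
    if K > tol:
        return "bowl"            # k1 and k2 have the same sign
    if K < -tol:
        return "saddle"          # k1 and k2 have opposite signs
    if abs(k1) <= tol and abs(k2) <= tol:
        return "flat point"      # all normal sections have zero curvature
    return "cylinder-like"       # exactly one principal curvature vanishes

# A few sample points with hypothetical principal curvatures.
for pair in [(1.0, 2.0), (1.0, -2.0), (1.0, 0.0), (0.0, 0.0)]:
    print(pair, classify_point(*pair))
```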

Let us consider a small segment G of the surface F, containing the 
point M in its interior, and at each point of this segment let us erect 
a normal to the surface. 
If we translate the initial points of all these normals to one point,
then they fill out a solid angle (figure 32). The size of this solid angle
will depend on the area of the segment G and on the extent to which
the surface is curved on this segment. Thus the degree of curvature of



the segment G may be characterized by the ratio of the size of the solid 
angle to the area of G; so it is natural to define the curvature of the 
surface at a given point as the limit of this ratio when the segment G 
shrinks to the point M.* It turns out that this limit is equal to the absolute
value of the Gaussian curvature at the point M. 
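On a sphere this definition can be checked directly. The normals to a spherical cap of a sphere of radius R, when moved to a common origin, fill a cap of the same angular width on the unit sphere, so the ratio of the solid angle to the area should equal $1/R^2$, which is the product of the two principal curvatures 1/R. The function names and sample values below are assumptions for illustration.

```python
import math

def cap_area(radius, half_angle):
    """Area of a spherical cap of the given angular half-width on a sphere
    of the given radius."""
    return 2.0 * math.pi * radius**2 * (1.0 - math.cos(half_angle))

def solid_angle_to_area_ratio(R, half_angle):
    """Solid angle filled by the normals to a cap of a sphere of radius R,
    divided by the area of that cap."""
    solid_angle = cap_area(1.0, half_angle)   # measured on the unit sphere
    return solid_angle / cap_area(R, half_angle)

R = 2.0   # hypothetical radius
for half_angle in (0.5, 0.1, 0.01):
    print(half_angle, solid_angle_to_area_ratio(R, half_angle), 1.0 / R**2)
```

For a sphere the ratio is the same for every cap, so no limit is needed; on a general surface the ratio only approaches the Gaussian curvature as the segment shrinks.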

The most remarkable property of the Gaussian curvature, which 
explains its great significance in the theory of surfaces, is the following. 
Let us suppose that the surface has been stamped out from a flexible 
but inextensible material, say a very thin sheet of tin, so that we can 
bend it into various shapes without stretching or tearing it. During this 
process the principal curvatures will change but, as Gauss showed, their
product $k_1 k_2$ will remain unchanged at every point. This fundamental
result shows that two surfaces with different Gaussian curvatures are
inherently distinct from each other, the distinction consisting of the fact 
that if we deform them in every possible way, without stretching or 
tearing, we can never superpose them on each other. For example, a 
segment of the surface of a sphere can never be distorted so as to lie 
on a plane or on the surface of a sphere of different radius. 

We have now considered certain basic concepts in the theory of surfaces.
As for the methods used in this theory, they consist, as was stated
previously, primarily in the application of analysis and above all of
differential equations. Simple examples of the use of analysis are to be
found in the proofs for the theorems of Euler and Meusnier. For more
complicated questions, we require a special method of relating problems
in the theory of surfaces to problems in analysis. This method is based
on the introduction of so-called curvilinear coordinates and was first
widely used in the work of Gauss on problems of the type discussed in
the following section.

* To measure the solid angle itself, we construct a sphere of unit radius with center
at its vertex. The area of the region in which the sphere intersects the solid angle is
then taken as the size of the solid angle (figure 32).


§4. Intrinsic Geometry and Deformation of Surfaces 

Intrinsic geometry. As indicated previously, a deformation of a
surface is defined as a change of shape that preserves the lengths of all
curves lying in the surface. For example, rolling up a sheet of paper
into a cylindrical tube represents, from the geometric point of view, a
deformation of part of the plane, since in fact the paper undergoes
practically no stretching, and the length of any curve drawn on it is not
changed by its being rolled up. Certain other geometric quantities
connected with the surface are also preserved; for example, the area of
figures on it. All properties of a surface that are not changed by deformations
make up what is called the intrinsic geometry of the surface.

But just which are these properties? It is clear that in a deformation
only those properties can be preserved which in the final analysis depend
entirely on lengths of curves, i.e., which may be determined by measurements
carried out on the surface itself. A deformation is a change of
shape preserving the length of curves, and any property which cannot
change under any deformation must be definable in one way or another
in terms of length. Thus intrinsic geometry is simply called geometry on
a surface. The very meaning of the words “intrinsic geometry” is that
it studies intrinsic properties of the surface itself, independent of the
manner in which the surface is embedded in the surrounding space.*
Thus, for example, if we join two points on a sheet of paper by a straight
line and then bend the paper (figure 33), the segment becomes a curve
but its property of being the shortest of all lines joining the given points
on the surface is preserved; so this property belongs to intrinsic geometry.
On the other hand, the curvature of this line will depend on how the
paper was bent and thus is not a part of intrinsic geometry.

In general, since the proofs of plane geometry make no reference to
the properties of the surrounding space, all its theorems belong to the
intrinsic geometry of any surface obtainable by deformation of a plane.
One may say that plane geometry is the intrinsic geometry of the plane.

* We note that the ideas of intrinsic geometry have led to a wide generalization of
the mathematical concept of space and have thereby played a very important role in
contemporary physics; for details see Chapter XVII.

Fig. 33.

Another example of intrinsic geometry is familiar to everyone, namely
geometry on the surface of a sphere, with which we usually have to deal
in making measurements on the surface of the earth. This example is a
particularly good one to illustrate the essential nature of intrinsic geometry;
because of the large radius of the earth, any immediately visible
area of its surface appears to us as part of a plane, so that the deviations
from plane geometry observable in the measurements of large distances
impress us as resulting not from the curvature of the earth’s surface
in space but from the inherent laws of “terrestrial geometry,” expressing
the geometric properties of the surface of the earth itself.

It remains to note that the idea of studying intrinsic geometry occurred 
to Gauss in connection with the problems of geodesy and cartography. 
Both these applied sciences are concerned in an essential way with the 
intrinsic geometry of the earth’s surface. Cartography deals, in particular, 
with distortions in the ratios of distances when part of the surface of 
the earth is mapped on a plane and thus with distinguishing between 
plane geometry and the intrinsic geometry of the surface of the earth. 

The intrinsic geometry of any surface may be pictured in the same way. 
Let us imagine that on a given surface there exist creatures so small
that within the limits of their range of vision the surface appears to be
plane (we know that a sufficiently small segment of any smooth surface
differs very little from a tangent plane); then these creatures will not
notice that the surface is curved in space, but in measuring large distances
they will nevertheless convince themselves that in their geometry certain
nonplanar laws prevail, corresponding to the intrinsic geometry of the
surface on which they live. That these laws are actually different for 
different surfaces may easily be seen from the following simple discussion. 
Let us choose a point O on the surface and consider a curve L such 
that the distance of each of its points from the point O, measured on 
the surface (i.e., along the shortest curve connecting this point to the 
point O) is equal to a fixed number r (figure 34). The curve L, from the 



point of view of the intrinsic geometry of the surface, is simply the 
circumference of a circle of radius r. A formula expressing the length 
s(r) in terms of r is part of the intrinsic geometry of the given surface. 
But such a formula may vary widely in character, depending on the 
nature of the surface: Thus on a plane, $s(r) = 2\pi r$; on a sphere of radius
R, as can easily be shown, $s(r) = 2\pi R \sin(r/R)$; on the surface illustrated
in figure 35, beginning with a certain value of r, the length of the circumference
with center O and radius r is at first independent of r but
then begins to decrease. Consequently, all these surfaces have different
intrinsic geometries.
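The difference between the two formulas for s(r) becomes noticeable only for radii comparable with R. The sketch below compares the plane and the sphere for a few radii; the value 6371 km for the mean radius of the earth is an assumed illustrative figure, and the function names are hypothetical.

```python
import math

def circumference_plane(r):
    """Length of a circle of radius r in the plane."""
    return 2.0 * math.pi * r

def circumference_sphere(r, R):
    """Length of the circle of intrinsic radius r on a sphere of radius R."""
    return 2.0 * math.pi * R * math.sin(r / R)

# Assumed mean radius of the earth in kilometres.
R_EARTH = 6371.0

for r in (1.0, 100.0, 1000.0, 5000.0):
    plane = circumference_plane(r)
    sphere = circumference_sphere(r, R_EARTH)
    print(f"r = {r:7.1f} km, plane - sphere = {plane - sphere:.6f} km")
```

Creatures measuring only circles of small radius would find the difference too small to notice, whereas for large radii the deviation from plane geometry is unmistakable.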

The basic concepts of intrinsic geometry. To illustrate the wide range
of concepts and theorems in intrinsic geometry, we may turn to plane
geometry which, as we have seen, is the intrinsic geometry of the plane.
Its subject matter consists of plane figures and their properties, which
are usually expressed in the form of relations among basic geometric
quantities such as length, angle, and area. For a rigorous proof that
angle and area belong to the intrinsic geometry of the plane, it is necessary
to show that they can be expressed in terms of length. But this is certainly
so; in fact, an angle may be computed if we know the length of the sides
of a triangle containing it, and the area of a triangle can also be computed
in terms of its sides, while to compute the area of a polygon we need
only divide it into triangles.
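The two computations mentioned here are the familiar law of cosines and Heron's formula. A minimal sketch, with hypothetical function names, shows how an angle and an area in the plane are obtained from lengths alone.

```python
import math

def angle_from_sides(a, b, c):
    """Angle opposite the side c in a plane triangle with sides a, b, c,
    by the law of cosines."""
    return math.acos((a * a + b * b - c * c) / (2.0 * a * b))

def area_from_sides(a, b, c):
    """Area of a plane triangle with sides a, b, c, by Heron's formula."""
    s = 0.5 * (a + b + c)
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

# The triangle with sides 3, 4, 5 has a right angle opposite the side 5.
print(math.degrees(angle_from_sides(3.0, 4.0, 5.0)))
print(area_from_sides(3.0, 4.0, 5.0))
```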

In considering plane geometry as the intrinsic geometry of the plane, 
there is no need to restrict ourselves to ideas learned in school. On the 
contrary, we may develop it as far as we like and study many new 
problems, provided only they can be stated, in the final analysis, in 
terms of length. Thus, in plane geometry we may successively introduce 
the length of a curve, the area of a surface bounded by curves, and so 
forth; they are all a part of the intrinsic geometry of the plane.

The same concepts are introduced in the intrinsic geometry of an
arbitrary surface. The length of a curve is the initial concept; the definition
of angles and areas is somewhat more complicated. If the intrinsic
geometry of a given surface differs from plane geometry, we cannot use
the customary formulas to define an angle or an area in terms of length.
However, as we have seen, a surface near a given point differs little
from its tangent plane. Speaking more precisely, the following is true:
If a small segment of a surface containing a given point M is projected
on the tangent plane at this point, then the distance between points,
measured on the surface, differs from the distance between their projections
by an infinitesimal of higher than the second order in comparison
with distances from the point M. Thus in defining geometric quantities
at a given point of a surface by taking a limit in which infinitesimals
occur of order no higher than the second, we may replace a segment
of the surface by its projection on the tangent plane. Thus the quantities
determined by measurement in the tangent plane turn out to belong to
the intrinsic geometry of the surface. This possibility of considering a
small segment of the surface as a plane is the basis of the definitions of
all the concepts of intrinsic geometry.

Fig. 36.

As an example let us consider the definitions of angle and area.
Following the general principle, we define the angle between curves on
a surface as the angle between their projections on the tangent plane
(figure 36). Obviously the angle defined in this manner is identical with
the angle between the tangents to the curves. The definition of area
given in §3 is based on the same principle. Finally, in order that
the tendency of a curve to twist in space may be defined “within”
the surface itself, we introduce the concept of “geodesic curvature,”
the name being reminiscent of measurements on the surface
of the earth. The geodesic curvature of a curve at a given point is
defined as the curvature of its projection on the tangent plane
(figure 37).

Fig. 37.

In this manner we see that the basic concepts of plane geometry may 
be introduced into the intrinsic geometry of an arbitrary surface. 

In any arbitrary surface it is also easy to define figures analogous to
the basic figures on the plane. For example, we have been dealing
previously with circumferences of circles, which are defined precisely as
in the case of the plane. Similarly, we may define the analogue of a line
segment, namely a geodesic segment, as the shortest curve on the surface
joining two given points. Further, it is natural to define a triangle as a
figure bounded by three geodesic segments and similarly for a polygon,
and so forth. Since the properties of all these figures and magnitudes
depend on the surface, there exist in this sense infinitely many different
intrinsic geometries. But intrinsic geometry, as a special branch of the
theory of surfaces, pays particular attention to certain general laws
holding for the intrinsic geometry of any surface and makes clear how
these laws are expressed in terms of the quantities which characterize
a given surface.

Thus, as we have noted earlier, one of the most important characteristics
of a surface, its Gaussian curvature, is not changed by deformation, i.e.,
depends only on the intrinsic geometry of the surface. But it turns out
that in general the Gaussian curvature already characterizes, to a remarkable
degree, the extent to which the intrinsic geometry of the surface
near a given point differs from plane geometry. As an example let us
consider on a surface a circle L of very small radius r, with center at
a given point O. On a plane the length $s(r)$ of its circumference is expressed
by the formula $s(r) = 2\pi r$. On a surface differing from a plane,
the dependence of the circumference on the radius is different; here the
deviation of $s(r)$ from $2\pi r$ depends essentially, for small r, on the Gaussian
curvature K at the center of the circle, namely:

$$s(r) = 2\pi r - \frac{\pi}{3} K r^3 + \epsilon r^3,$$

where e -* 0 as r -* 0. In other words, for small r the circumference 
may be computed by the usual formula if we disregard terms of the 
third degree of smallness, and in this case the error (with accuracy to 
terms of higher than the third order) is proportional to the Gaussian 
curvature. In particular, if K > 0, then the circumference of a circle of 
small radius is smaller than the circumference of a circle with the same 
radius in a plane, and if K < 0, it is larger. These latter facts are easy 
to visualize: Near a point with positive curvature the surface has the 
shape of a bowl so that circumferences are reduced, whereas near a point 
with negative curvature the circumference, being situated on a “saddle,”
has a wavelike shape and is thus considerably lengthened (figure 38). 
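The expansion $s(r) = 2\pi r - \frac{\pi}{3} K r^3 + \epsilon r^3$ can be tested against the exact formula for a sphere, where $s(r) = 2\pi R \sin(r/R)$ and $K = 1/R^2$. The sketch below computes the ratio $(2\pi r - s(r))/r^3$ for decreasing r and compares it with $\pi K/3$. The function name and the radius R are assumptions for illustration.

```python
import math

def circumference_defect_coefficient(r, R):
    """The ratio (2*pi*r - s(r)) / r^3 for the circle of intrinsic radius r
    on a sphere of radius R, where s(r) = 2*pi*R*sin(r/R)."""
    s = 2.0 * math.pi * R * math.sin(r / R)
    return (2.0 * math.pi * r - s) / r**3

R = 2.0             # hypothetical sphere radius
K = 1.0 / R**2      # Gaussian curvature of the sphere

for r in (0.5, 0.1, 0.02):
    print(r, circumference_defect_coefficient(r, R), math.pi * K / 3.0)
```

Since K is positive here, the defect is positive, so the circumference is shorter than in the plane, in agreement with the picture of the bowl.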


Fig. 38.




From the theorem just mentioned, it follows that a surface with varying
Gaussian curvature is extremely inhomogeneous from a geometric point
of view; the properties of its intrinsic geometry change from point to
point. The general character of the problems of intrinsic geometry causes
it to resemble plane geometry, but this inhomogeneity, on the other
hand, makes it profoundly different from plane geometry. On the plane,
for example, the sum of the angles of a triangle is equal to two right
angles; but on an arbitrary surface the sum of the angles of a triangle
(with geodesics for sides) is undetermined even if we are told that it lies
on a known surface and has sides of given length. However, if we know
the Gaussian curvature K at every point of the triangle, then the sum
of its angles, $\alpha$, $\beta$, $\gamma$, can be computed by the formula

$$\alpha + \beta + \gamma = \pi + \iint K \, d\sigma,$$

where the integral is taken over the surface of the triangle. This formula
contains as a special case the well-known theorems on the sum of the
angles of a triangle in the plane and on the unit sphere. In the first
case $K = 0$ and $\alpha + \beta + \gamma = \pi$, while in the second $K = 1$ and
$\alpha + \beta + \gamma = \pi + S$, where S is the area of the spherical triangle.
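The case of the unit sphere can be verified on a simple example: the triangle cut out by the three coordinate planes in the first octant. Its area is one eighth of the whole sphere, and each of its angles is a right angle. The sketch below computes the angles from the vertices; the helper `angle_at` is a hypothetical name introduced for this illustration.

```python
import math

def angle_at(a, b, c):
    """Angle at the vertex a of the geodesic triangle abc on the unit sphere,
    measured between the great-circle arcs from a to b and from a to c."""
    def tangent_toward(p, q):
        # Part of q orthogonal to p: the direction in which the arc leaves p.
        d = sum(pi * qi for pi, qi in zip(p, q))
        t = [qi - d * pi for pi, qi in zip(p, q)]
        n = math.sqrt(sum(ti * ti for ti in t))
        return [ti / n for ti in t]
    u, v = tangent_toward(a, b), tangent_toward(a, c)
    cosine = sum(ui * vi for ui, vi in zip(u, v))
    return math.acos(max(-1.0, min(1.0, cosine)))

# Vertices of the octant triangle on the unit sphere.
A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)

angle_sum = angle_at(A, B, C) + angle_at(B, C, A) + angle_at(C, A, B)
area = 4.0 * math.pi / 8.0     # one octant of the unit sphere

print(angle_sum, math.pi + area)
```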

It may be proved that every sufficiently small segment of a surface
with zero Gaussian curvature may be deformed, or, as it is customary
to say, developed into a plane, since it has the same intrinsic geometry
as the plane. Such surfaces are called developable. And if the Gaussian
curvature is near zero, then although the surface cannot be developed
into a plane, still its intrinsic geometry differs little from plane geometry,
which indicates once again that the Gaussian curvature acts as a measure
of the extent to which the intrinsic geometry of a surface deviates from
plane geometry.

Geodesic lines. In the intrinsic geometry of a surface the role of
straight lines is played by geodesic lines, or, as they are usually called,
“geodesics.”

A straight line in a plane may be defined as a line made up of intervals
overlapping one another. A geodesic is defined in exactly the same way,
with geodesic segments taking the place of intervals. In other words, a
geodesic is a curve on a surface such that every sufficiently small piece
of it is a shortest path. Not every geodesic is a shortest path in the large,
as may be noted on the surface of a sphere, where every arc of a great
circle is a geodesic, although this arc will be the shortest path between
its end points only if it is not greater than a semicircle. A geodesic, as
we see, may even be a closed curve. 

To illustrate certain important properties of geodesics, let us consider 
the following mechanical model.* On the surface F let there be stretched 
a rubber string with fixed ends (figure 39).** The string will be in 
equilibrium when it has the shortest possible length, since any change in 
its position will then involve an increase of length, which could be 
produced only by external forces. In other words, the string will be in 
equilibrium if it is lying along a geodesic. But for equilibrium, it is 
necessary that the elastic forces on each segment of the string be 
counterbalanced by the resistance of the surface, directed along the 
normal to it. (We assume that the surface is smooth and that there is no 
friction between it and the string.) But it was proved in §2 that the 
pressure on the support caused by the tension of the string is directed 
along the principal normal to the curve along which the string lies. Thus 
we are led to the following result: The principal normal to a geodesic at 
each point coincides in direction with the normal to the surface. The 
converse of this theorem is also true: Every curve on a regular surface 
which has this property is a geodesic. 

* As noted previously, our reasoning here is not a strict proof of the properties of 
geodesic curves. It is given only to illustrate the most important of these properties. 

** A stretched string will not remain on a surface unless the surface is convex; so 
in order not to make exceptions, it is better to imagine that the surface is in two layers, 
with the string running between them. 

This property of a geodesic allows us to deduce the following important 
fact: If a material point is moving on a surface in such a way that there 
are no forces acting on it except for the reaction of the surface, then it 
follows a geodesic. For, as we know from §2, the normal acceleration 
of a point is directed along the principal normal to the trajectory and 
since the reaction of the surface is the only force acting on the point, 
the principal normal to the trajectory is identical with the normal to 
the surface, so that from the preceding theorem the trajectory is a 
geodesic. This last property of geodesics increases their resemblance to 
straight lines. Just as the motion of a free point, because of inertia, is 
along a straight line, so the motion of a point forced to stay on a surface, 
but not affected by external forces, will be along a geodesic.† 

† Here by “external” forces we mean all forces except the reaction of the surface. 

From the same property of geodesics comes the following theorem. 
If two surfaces are tangent along a curve that is a geodesic on one of 
them, then this curve will also be a geodesic on the other. For at each 
point of the curve, the surfaces have a common tangent plane and 
consequently a common normal, and since the curve is a geodesic on 
one of the surfaces, this normal coincides with the principal normal to 
the curve, so that on the second surface also the curve will be a geodesic. 

From these results follow two further intuitive properties of geodesic 
curves. In the first place, if an elastic rectangular plate (for example a 
steel ruler) lies with its median line completely on a surface, then it is 
tangent to this surface along a geodesic. (Evidently the line of contact 
is a geodesic on the ruler, so that it must be a geodesic also on the surface.) 
Second, if a surface rolls along a plane in such a way that the point of 
contact traces a straight line on the plane, then the trace of this straight 
line on the surface is a geodesic.* Both these properties are readily 
demonstrated on a cylinder, where it is easy to convince oneself by 
experiment that the median line of a straight plane strip lying on the 
cylinder (figure 40) coincides with either a generator or the circumference 
of a circle or a helix, and it is not difficult to prove that a geodesic curve 
on a cylinder can be only one of these three. The same curves will be 
traced out on a cylinder if we roll it on a plane on which we have drawn a 
straight line in chalk. 

Fig. 40. 

* This proposition does not differ essentially from the preceding one, since the rolling 
of a surface on a plane is equivalent in a well-defined sense to the unwinding of a 
plane strip along the surface. 

The analogy between geodesics and straight lines in a plane may be 
supplemented by still another important property, taken directly from 
the definition of a geodesic. Namely, straight lines in the plane may be 
defined as curves of zero curvature and geodesics on a surface as curves 
of zero geodesic curvature. (We recall that the geodesic curvature is the 
curvature of the projection of the curve on the tangent plane, cf. figure 37.) 
It is quite natural that our present definition of a geodesic should coincide 
with the earlier one; for if at every point of the curve the curvature of 
its projection on the tangent plane is equal to zero, then the curve departs 
from its tangent essentially in the direction of the normal to the surface, 
so that the principal normal to the curve is directed along the normal to 
the surface and the curve is a geodesic in the original sense. Conversely, 
if a curve is a geodesic, then its principal normal, and so also its deviation 
from the tangent line, are directed along the normal to the surface, so 
that in projecting on the tangent plane we get a curve in which the 
deviation from the tangent is essentially smaller than for the original 
curve, and the curvature of the projection so formed turns out to be 
equal to zero. 
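
The two descriptions of a geodesic can be compared numerically on a cylinder. The sketch below is an illustration under our own assumptions: a circular cylinder of radius `R = 1`, curves given as `(R cos t, R sin t, z(t))`, and derivatives approximated by central differences. The function name `geodesic_curvature` is ours. It removes from the curvature vector its component along the surface normal; what remains lies in the tangent plane, and its length is the geodesic curvature.

```python
import math

R = 1.0  # cylinder radius (an arbitrary choice for the illustration)

def curve_point(z, t):
    """Point of the curve (R cos t, R sin t, z(t)) lying on the cylinder."""
    return (R * math.cos(t), R * math.sin(t), z(t))

def geodesic_curvature(z, t, h=1e-3):
    """Length of the tangent-plane part of the curvature vector at parameter t."""
    p0, p1, p2 = (curve_point(z, t + k * h) for k in (-1, 0, 1))
    vel = [(c - a) / (2 * h) for a, c in zip(p0, p2)]
    acc = [(a - 2 * b + c) / h ** 2 for a, b, c in zip(p0, p1, p2)]
    speed2 = sum(x * x for x in vel)
    # Curvature vector: acceleration minus its part along the velocity, over speed^2.
    along = sum(a * v for a, v in zip(acc, vel)) / speed2
    kvec = [(a - along * v) / speed2 for a, v in zip(acc, vel)]
    normal = (math.cos(t), math.sin(t), 0.0)  # unit normal to the cylinder
    k_normal = sum(k * n for k, n in zip(kvec, normal))
    tangential = [k - k_normal * n for k, n in zip(kvec, normal)]
    return math.sqrt(sum(x * x for x in tangential))

k_helix = geodesic_curvature(lambda t: 0.5 * t, 0.7)        # helix: a geodesic
k_wavy = geodesic_curvature(lambda t: math.sin(t), math.pi / 2)  # not a geodesic
```

For the helix the result is zero up to discretization error, so its principal normal lies along the normal to the cylinder. For the wavy curve z = sin t the geodesic curvature at t = π/2 is close to 1, so that curve is not a geodesic.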

The course of a geodesic may vary widely for different surfaces. As 
an example, in figure 41 we trace some geodesics on a hyperboloid of 
revolution. 

Fig. 41. 


Deformation of surfaces. Since intrinsic geometry studies the properties 
of surfaces that are invariant under deformation, it naturally investigates 
these deformations themselves. The theory of deformation of 
surfaces is one of the most interesting and difficult branches of geometry 
and includes many problems which, although simple to state, have not 
yet been finally solved. 

Certain questions about the deformation of surfaces were already 
considered by Euler and Minding, but general results for arbitrary 
surfaces were not derived until later. 

In the general theory of deformation, we first of all raise the question 
whether deformation is possible for all surfaces and, if so, to what extent. 
For analytic surfaces, i.e., surfaces defined by functions of the coordinates 
that can be expanded in a Taylor series, this question was solved at the 
end of the last century by the French mathematician Darboux. In 
particular, he showed the following: If on such a surface we consider any 
geodesic and assign in space an arbitrary (analytic) curve with the same 
length, and with curvature nowhere equal to zero, then a sufficiently 
narrow strip of the surface, containing the given geodesic, can be deformed 
so that the geodesic coincides with the given curve.* This theorem shows 
that a strip of the surface may be deformed rather arbitrarily. However, 
it has been proved that if a geodesic is to be transformed into a preassigned 
curve, then the surface may be deformed in no more than two ways. 
For example, if the curve is plane, then the two positions of the surface 
will be mirror images of each other in the plane. If the geodesic is a 
straight line, then this last proposition is not true, as can be shown by 
deforming a cylindrical surface. 

* The case of transforming a geodesic into a curve with zero curvature is excluded, 
since it is easy to show that for surfaces of positive Gaussian curvature this is impossible. 

We have defined a deformation as a transformation of the surface that 
preserves the lengths of all curves on the surface. Here we have considered 
only the final result of the transformation; the question of what happens 
to the curve during the process did not enter. However, in considering 
a surface as made from a flexible but unstretchable material, it is natural 
to consider a continuous transformation, at each instant of which the 
lengths remain unchanged (physically this corresponds to the 
unstretchability of the material). Such transformations are called 
continuous deformations. 

At first glance it may seem that every deformation can be realized in 
a continuous manner, but this is not so. For example, it has been shown 
that a surface in the form of a circular trough (figure 42) does not admit 
continuous deformations (this explains, among other things, the familiar 
fact that a pail with a curved rim is considerably stronger than one with 
a plain rim), although deformations of such a surface are possible: for 
example, one may cut the trough along the circle on which it rests on 
a horizontal plane and replace one half of it by its mirror image in this 
plane (compare figure 43 with figure 42; to aid visualization we have 
drawn only the left half of the surface). It is intuitively clear that the 
impossibility of a continuous deformation is due to the circular shape 
of the trough; for a straight trough such a deformation can be performed 
continuously. 

If we restrict ourselves to a sufficiently small segment of the surface, 
then there are no obvious hindrances to its continuous deformation, and 
we might expect that every deformation of a small segment of the surface 
can be realized by a continuous transformation, followed perhaps by a 
mirror reflection. This is in fact true, but only under the condition that 
on the given small segment of the surface the Gaussian curvature never 
vanishes (excepting the case that it vanishes everywhere). But if the 
Gaussian curvature vanishes at isolated points, then, as N. V. Efimov 
showed in 1940, even arbitrarily small segments of a regular surface may 
not admit any continuous deformation without loss of regularity. For 
example, the surface defined by the equation $z = x^9 + \lambda x^6 y^3 + y^9$, where 
$\lambda$ is a transcendental number, has the property that no segment containing 
the origin, no matter how small it may be, admits sufficiently regular 
continuous deformations. Efimov’s theorem is a new and somewhat 
unexpected result in classical differential geometry. 

In addition to these general questions about deformation, a great deal 
of attention is being paid to special types of deformation of surfaces. 

The connection of the intrinsic geometry of a surface with the form of 
the surface in space. We already know that certain properties of a 
surface, and of the figures on it, are defined by the intrinsic geometry 
of the surface even though these properties are very closely related to 
other properties that depend on how the surface is embedded in the 
surrounding space, properties that are, as they say, “extrinsic” to the 
surface. For example, the principal curvatures are extrinsic properties of 
a surface, but their product (the Gaussian curvature) is intrinsic. As 
another example, in order that the principal normal of a curve lying on a 
surface should coincide with the normal to the surface, it is necessary and 
sufficient that this curve have a property defined by its intrinsic geometry, 
namely that it be a geodesic. 

Consequently, the intrinsic geometry of a surface will determine its 
space form only to a certain extent. 

The dependence of the space form of a surface on its intrinsic geometry 
may be expressed analytically in the form of equations containing certain 
quantities that characterize the intrinsic geometry and certain other 
quantities that characterize the way in which the curved surface is 
embedded in space. One of these equations is the formula expressing 
the Gaussian curvature in intrinsic terms and is due to Gauss. Two other 
such equations are those of Peterson and Codazzi, mentioned in §1. 

The equations of Gauss, Peterson, and Codazzi completely express the 
connection between the intrinsic geometry of a surface and the character 
of its curvature in space, since all possible interrelations between intrinsic 
and extrinsic properties of an arbitrary surface are included, at least in 
implicit form, in these equations. 

Since the form of a surface in space is not completely defined by its 
intrinsic geometry, we naturally ask, What extrinsic properties must still 
be assigned in order to determine the surface completely? It turns out 
that if two surfaces have the same intrinsic geometry and if, at 
corresponding points and in corresponding directions, the curvatures of 
the normal sections of these surfaces have the same sign, then the surfaces 
are congruent; that is, they can be translated so as to coincide with each 
other. We note that Peterson discovered this theorem 15 years earlier 
than Bonnet, with whose name it is usually associated. 

Analytic apparatus in the theory of surfaces. The systematic application 
of analysis to the theory of surfaces led to the building up of an 
analytic apparatus especially suitable for this purpose. The decisive step 
in this direction was taken by Gauss, who introduced the method of 
representing surfaces by so-called curvilinear coordinates. This method 
is a natural generalization of the idea of Cartesian coordinates on the 
plane and is closely connected with the intrinsic geometry of the surface, 
for which the presentation of the surface by an equation of the form 
z = f(x, y) is not convenient. The inconvenience consists of the fact that 
the x, y coordinates of a point on the surface change when the surface 
is deformed. To eliminate this difficulty, the coordinates are chosen on 
the surface itself; they define each point by two numbers u and v, which 
are associated with the given point and remain associated with it even 
after deformation of the surface. The space coordinates x, y, z of the 
point will in each case be functions of u and v. The numbers u and v 
defining the point on a surface are called its curvilinear coordinates. 
The choice of name is to be understood as follows: If we fix the value 
of one of these coordinates, say v, and vary the other, then we get a 
coordinate curve on the surface. The coordinate curves form a curvilinear 
net on the surface, similar to the coordinate net on a plane. We note 
that the familiar method of describing the position of a point on the 
surface of the earth by means of longitude and latitude consists simply of 
introducing curvilinear coordinates on the surface of a sphere; the 
coordinate net in this case consists of circles, namely the meridians and 
parallels* (figure 44). To describe the spatial position of a surface by 
means of curvilinear coordinates, we need to define the position of each 
point in terms of u and v, for example by giving, as a function of u and v, 
the vector r = r(u, v), issuing from some fixed origin to the points on 
the surface and called the radius vector of the surface. (This is equivalent 
to giving the x, y, and z components of the vector r as functions of u 
and v.)† To define a curve lying on a given surface, we need to give the 
coordinates u, v as functions of one parameter t; then the radius vector 
to a point moving along this curve is expressed as a composite function 

$$r = r[u(t), v(t)].$$

* It is characteristic that geographic coordinates and their practical applications 
were known long in advance of Descartes' introduction of the usual coordinates in 
the plane. 

† Of course Gauss did not use vector notation, but defined the three coordinates 
x, y, z of the points of the surface separately as functions of u and v. Vectors, which 
were introduced as a result of the work of Hamilton and Grassmann, were at first used 
widely in physics and only later (in fact, in the 20th century) became the traditional 
apparatus for analytic and differential geometry. 

For vector functions the concepts of derivative and differential may be 
generalized word for word; from the definition of the derivative as the 
limit of $\Delta r / \Delta t$ when $\Delta t \to 0$ (r is a function of the parameter t) it follows 
at once that the derivative of the radius vector of a curve is a vector 
directed along the tangent to the curve (figure 45). For vector functions 
the basic properties of ordinary derivatives are still valid; for example, 
the chain rule 

$$\frac{dr[u(t), v(t)]}{dt} = \frac{\partial r}{\partial u} \frac{du}{dt} + \frac{\partial r}{\partial v} \frac{dv}{dt} = r_u u'_t + r_v v'_t, \tag{7}$$

where $r_u$ and $r_v$ are the partial derivatives of the vector function r(u, v). 
The length of a curve, as can be shown, is expressed by the integral 

$$s = \int \sqrt{x'^2(t) + y'^2(t) + z'^2(t)} \, dt.$$

Thus, the differential of the length of a curve is equal to 

$$ds = \sqrt{x'^2(t) + y'^2(t) + z'^2(t)} \, dt.$$

But since $x'(t)$, $y'(t)$, and $z'(t)$ are components of the vector $dr/dt = r'_t$, 
we may write $ds = |r'_t| \, dt$, where $|r'_t|$ denotes the length of the vector $r'_t$. 
For curves lying on a surface, we get from (7) 

$$ds = |r_u u'_t + r_v v'_t| \, dt.$$

Computing the square of the length of the vector on the right we obtain, 
by the rules of vector algebra,* 

$$ds^2 = [r_u^2 u_t'^2 + 2 r_u r_v u'_t v'_t + r_v^2 v_t'^2] \, dt^2.$$

Passing to differentials and introducing the notation 

$$r_u^2 = E(u, v), \quad r_u r_v = F(u, v), \quad r_v^2 = G(u, v),$$

we have 

$$ds^2 = E \, du^2 + 2F \, du \, dv + G \, dv^2.$$

We see that the square of the differential of arc length on a surface is 
a quadratic form in the differentials du and dv with coefficients depending 
on the point of the surface. This form is called the first fundamental 
quadratic form of the surface. Given the coefficients E, F, and G of this 
form at each point on a surface we may compute the length of any curve 
on the surface by the formula 

$$s = \int \sqrt{E u_t'^2 + 2F u'_t v'_t + G v_t'^2} \, dt,$$

so that its intrinsic geometry is thereby completely determined. 

* The square of the length of a vector is the scalar product of the vector with itself, 
and for scalar multiplication (cf. Chapter III, §9) the usual rules hold for the removal 
of brackets. 
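
As a small illustration of the length formula, the following Python sketch evaluates the integral of the square root of E u'² + 2F u'v' + G v'² by the midpoint rule. The choice of surface is our own assumption: a sphere of radius `R` with u as longitude and v as latitude, r(u, v) = R (cos v cos u, cos v sin u, sin v), for which E = R² cos² v, F = 0, G = R². The names `first_form` and `curve_length` are likewise ours.

```python
import math

R = 2.0  # sphere radius (an arbitrary value chosen for the illustration)

def first_form(u, v):
    """E, F, G for the sphere r(u, v) = R (cos v cos u, cos v sin u, sin v)."""
    return (R * math.cos(v)) ** 2, 0.0, R ** 2

def curve_length(u, v, du, dv, t0, t1, steps=2000):
    """Midpoint-rule value of the integral of sqrt(E u'^2 + 2F u'v' + G v'^2) dt."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        E, F, G = first_form(u(t), v(t))
        total += math.sqrt(E * du(t) ** 2 + 2 * F * du(t) * dv(t) + G * dv(t) ** 2) * h
    return total

# The parallel at latitude 60 degrees: u = t, v = pi/3, 0 <= t <= 2 pi.
parallel = curve_length(lambda t: t, lambda t: math.pi / 3,
                        lambda t: 1.0, lambda t: 0.0, 0.0, 2 * math.pi)

# A quarter of a meridian: u = 0, v = t, 0 <= t <= pi/2.
meridian = curve_length(lambda t: 0.0, lambda t: t,
                        lambda t: 0.0, lambda t: 1.0, 0.0, math.pi / 2)
```

The parallel has length 2πR cos(π/3) = 2π and the quarter meridian has length πR/2 = π. Only E, F, and G enter the computation; the embedding of the sphere in space is never used.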

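We show, as an example, how to express angle and area in terms of 
E, F, and G. Let two curves issue from a given point, one of them given 
by the equations $u = u_1(t)$, $v = v_1(t)$ and the other by the equations 
$u = u_2(t)$, $v = v_2(t)$. Then the tangents to these curves are given by the 
vectors 

$$r_1 = r_u \frac{du_1}{dt} + r_v \frac{dv_1}{dt}, \qquad r_2 = r_u \frac{du_2}{dt} + r_v \frac{dv_2}{dt}.$$

The cosine of the angle between these vectors is equal to the scalar 
product $r_1 r_2$ divided by the product of the lengths $|r_1| \, |r_2|$: 

$$\cos\alpha = \frac{r_u^2 \frac{du_1}{dt} \frac{du_2}{dt} + r_u r_v \left( \frac{du_1}{dt} \frac{dv_2}{dt} + \frac{du_2}{dt} \frac{dv_1}{dt} \right) + r_v^2 \frac{dv_1}{dt} \frac{dv_2}{dt}}{|r_1| \, |r_2|}.$$

Recalling that $r_u^2 = E$, $r_u r_v = F$, $r_v^2 = G$, we get 

$$\cos\alpha = \frac{E \frac{du_1}{dt} \frac{du_2}{dt} + F \left( \frac{du_1}{dt} \frac{dv_2}{dt} + \frac{du_2}{dt} \frac{dv_1}{dt} \right) + G \frac{dv_1}{dt} \frac{dv_2}{dt}}{\sqrt{E \left( \frac{du_1}{dt} \right)^2 + 2F \frac{du_1}{dt} \frac{dv_1}{dt} + G \left( \frac{dv_1}{dt} \right)^2} \; \sqrt{E \left( \frac{du_2}{dt} \right)^2 + 2F \frac{du_2}{dt} \frac{dv_2}{dt} + G \left( \frac{dv_2}{dt} \right)^2}}.$$
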

To obtain a formula for area, we consider a curvilinear rectangle 
bounded by the coordinate curves $u = u_0$, $v = v_0$, $u = u_0 + \Delta u$, 
$v = v_0 + \Delta v$, and we take as an approximation to it the parallelogram 
lying in the tangent plane and bounded by the vectors $r_u \Delta u$, $r_v \Delta v$, tangent 
to the coordinate curves (figure 46). The area of this parallelogram is 
$\Delta s = |r_u| \, |r_v| \, \Delta u \, \Delta v \sin\varphi$, where $\varphi$ is the angle between $r_u$ and $r_v$. Since 
$\sin\varphi = \sqrt{1 - \cos^2\varphi}$, it follows that 

$$\Delta s = |r_u| \, |r_v| \, \Delta u \, \Delta v \sqrt{1 - \cos^2\varphi} = \sqrt{r_u^2 r_v^2 - |r_u|^2 |r_v|^2 \cos^2\varphi} \; \Delta u \, \Delta v.$$

Recalling that $r_u^2 = E$, $r_v^2 = G$, $|r_u| \cdot |r_v| \cos\varphi = r_u r_v = F$, we get 
$\Delta s = \sqrt{EG - F^2} \, \Delta u \, \Delta v$. Summing up the areas of the parallelograms and 
taking the limit as $\Delta u, \Delta v \to 0$ we obtain the formula for area 

$$S = \iint_D \sqrt{EG - F^2} \, du \, dv,$$

where the integration is taken over the domain D of the variables u and v which 
describe the given segment of the surface. 
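
The area formula can be tried out in the same way. The sketch below is an illustration under our own assumptions: it uses the sphere of radius `R` with longitude u and latitude v, for which E = R² cos² v, F = 0, G = R², and approximates the double integral of √(EG − F²) by the midpoint rule on a uniform grid. The function name `surface_area` is ours.

```python
import math

def surface_area(first_form, u0, u1, v0, v1, n=200):
    """Midpoint-rule value of the double integral of sqrt(EG - F^2) du dv."""
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            E, F, G = first_form(u, v)
            # max(..., 0) guards against tiny negative values from rounding.
            total += math.sqrt(max(E * G - F * F, 0.0)) * hu * hv
    return total

R = 2.0  # sphere radius (an arbitrary value chosen for the illustration)

def sphere_form(u, v):
    """E, F, G for r(u, v) = R (cos v cos u, cos v sin u, sin v)."""
    return (R * math.cos(v)) ** 2, 0.0, R ** 2

area = surface_area(sphere_form, 0.0, 2 * math.pi, -math.pi / 2, math.pi / 2)
```

The value agrees with the familiar 4πR² to within the accuracy of the quadrature.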

In this way, curvilinear coordinates are very convenient for studying 
the intrinsic geometry of a surface. 

It also turns out that the manner in which a curved surface is embedded 
in the surrounding space can be characterized by a certain quadratic 
form in the differentials du, dv. Thus if n is a unit vector normal to the 
surface at the point M, and $\Delta r$ is the increment in the radius vector to 
the surface as we move from this point, then the deviation h of the surface 
from the tangent plane (figure 47) is equal to $n \, \Delta r$. Expanding the 
increment $\Delta r$ by Taylor’s formula, we get 

$$h = n \, dr + \tfrac{1}{2} n \, d^2 r + \varepsilon (du^2 + dv^2),$$

where $\varepsilon \to 0$ as $\sqrt{du^2 + dv^2} \to 0$. Since the vector dr lies in the tangent 
plane, we have $n \, dr = 0$. The last term, $\varepsilon (du^2 + dv^2)$, is small in 
comparison with the squares of the differentials du and dv. There remains 
the principal term $\tfrac{1}{2} n \, d^2 r$. Thus twice the principal part of h, namely 
$n \, d^2 r$, is a quadratic form with respect to du and dv: 

$$n \, d^2 r = n r_{uu} \, du^2 + 2 n r_{uv} \, du \, dv + n r_{vv} \, dv^2.$$

This form describes the character of the deviation of the surface from 
the tangent plane. It is called the second fundamental quadratic form of 
the surface. Its coefficients, which depend on u and v, are usually written: 

$$n r_{uu} = L, \quad n r_{uv} = M, \quad n r_{vv} = N.$$

Knowing the second fundamental quadratic form, we can compute 
the curvature of any curve on a surface. Thus, applying the formula 
$k = \lim_{l \to 0} 2h / l^2$, we obtain the result that the curvature of the normal 
section in the direction corresponding to the ratio du/dv is equal to 

$$k_n = \frac{n \, d^2 r}{ds^2} = \frac{L \, du^2 + 2M \, du \, dv + N \, dv^2}{E \, du^2 + 2F \, du \, dv + G \, dv^2}.$$

If the curve is not a normal section, then by Meusnier’s theorem it is 
sufficient to divide the curvature of the normal section in the same direction 
by the cosine of the angle between the principal normal to the curve 
and the normal to the surface. 
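
The ratio of the two quadratic forms is easy to evaluate numerically. The following Python sketch is an illustration under our own assumptions: the partial derivatives of r(u, v) are approximated by central differences, the unit normal is taken along the vector product of r_u and r_v (so the sign of the result depends on this choice of orientation), and the test surface is a circular cylinder of radius `R`. The function name `normal_curvature` is ours.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normal_curvature(r, u, v, du, dv, h=1e-3):
    """(L du^2 + 2M du dv + N dv^2) / (E du^2 + 2F du dv + G dv^2) at (u, v)."""
    p = r(u, v)
    pu, mu = r(u + h, v), r(u - h, v)
    pv, mv = r(u, v + h), r(u, v - h)
    r_u = [(a - b) / (2 * h) for a, b in zip(pu, mu)]
    r_v = [(a - b) / (2 * h) for a, b in zip(pv, mv)]
    r_uu = [(a - 2 * c + b) / h ** 2 for a, c, b in zip(pu, p, mu)]
    r_vv = [(a - 2 * c + b) / h ** 2 for a, c, b in zip(pv, p, mv)]
    r_uv = [(a - b - c + d) / (4 * h ** 2) for a, b, c, d in zip(
        r(u + h, v + h), r(u + h, v - h), r(u - h, v + h), r(u - h, v - h))]
    n = _cross(r_u, r_v)
    length = math.sqrt(_dot(n, n))
    n = [x / length for x in n]          # unit normal (orientation: r_u x r_v)
    E, F, G = _dot(r_u, r_u), _dot(r_u, r_v), _dot(r_v, r_v)
    L, M, N = _dot(n, r_uu), _dot(n, r_uv), _dot(n, r_vv)
    return ((L * du * du + 2 * M * du * dv + N * dv * dv)
            / (E * du * du + 2 * F * du * dv + G * dv * dv))

R = 2.0  # cylinder radius (an arbitrary value chosen for the illustration)

def cylinder(u, v):
    return (R * math.cos(u), R * math.sin(u), v)

k_circle = normal_curvature(cylinder, 0.3, 0.0, 1.0, 0.0)     # along the circle
k_generator = normal_curvature(cylinder, 0.3, 0.0, 0.0, 1.0)  # along the generator
k_diagonal = normal_curvature(cylinder, 0.3, 0.0, 1.0, R)     # at 45 degrees
```

With the outward normal the curvature along the circle comes out as −1/R, along the generator as 0, and at 45° to both as −1/(2R), which is the mean of the two principal curvatures, in accordance with Euler's theorem.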

The introduction of the second fundamental quadratic form provides 
an analytic approach to the study of how the surface is curved in space. 
In particular, one may derive the theorems of Euler and Meusnier, the 
expressions for the Gaussian and mean curvature, and so forth, in a 
purely analytic way. 

Peterson’s theorem, mentioned earlier, shows that the two quadratic 
forms, taken together, define a surface up to its position in space, so 
that the analytic study of any properties of a surface consists of the study 
of these forms. In conclusion, we note that the coefficients of the two 
quadratic forms are not independent; the connection mentioned earlier 
between the intrinsic geometry of a curved surface and the way in which 
it is embedded in space is expressed analytically by three relations (the 
equations of Gauss-Codazzi) between the coefficients of the first and the 
second fundamental quadratic forms. 

§5. New Developments in the Theory of Curves and Surfaces 

Families of curves and surfaces. Even though the basic theory of 
curves and surfaces was to a large degree complete by the middle of the 
last century, it has continued to develop in several new directions, which 
greatly extend the range of figures and properties investigated in 
contemporary differential geometry. There is one of these developments 
whose origins go back to the beginning of differential geometry, namely 
the theory of “families” or of continuous collections of curves and surfaces, 
but this theory may be considered new in the sense that its more profound 
aspects were not investigated until after the basic theory of curves and 
surfaces was already completely developed. 

In general a continuous collection of figures is called an n-parameter 
family if each figure of the collection is determined by the values of n 
parameters and all the quantities characterizing the figure (in respect to 
its position, form, and so forth) depend on these parameters in a manner 
which is at least continuous. From the point of view of this general 
definition, a curve may be considered as a one-parameter family of points 
and a surface as a two-parameter family of points. The collection of all 
circles in the plane is an example of a three-parameter family of curves, 
since a circle in the plane is determined by three parameters: the two 
coordinates of its center and its radius. 

The simplest question in the theory of families of curves or surfaces 
consists of finding the so-called envelope of the family. A surface is 
called the envelope of a given family of surfaces if at each of its points 
it is tangent to one of the surfaces of the family and is in this way tangent 
to every one of them. For example, the envelope of a family of spheres of 
equal radius with centers on a given straight line will be a cylinder 
(figure 48), and the envelope of such spheres with centers on all points 
of a given plane will consist of two parallel planes. The envelope of a 
family of curves is defined similarly. Figure 49 diagrams jets of water 
issuing from a fountain at various angles; in any one plane they form a 
family of curves, which may be considered approximately as parabolas; 
their envelope stands out clearly as the general contour of the cascade of 
water. Of course, not every family of curves or surfaces has an envelope; 
for example, a family of parallel straight lines does not have one. There 
exists a simple general method of finding the envelope of any family; for a 
family of curves in the plane this method was given by Leibnitz. 

Fig. 48. 

Fig. 49. 

Every curve is obviously the envelope of its tangents, and in exactly 
the same way every surface is the envelope of its tangent planes. 
Incidentally, this fact provides a new method of defining a curve or a surface 
by giving the family of its tangent lines or planes. For some problems 
this method turns out to be the most convenient. 
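
The envelope of the fountain jets mentioned above can be found explicitly. The sketch below rests on assumptions of our own that are not stated in the text: the jets are ideal parabolas leaving the origin with a common speed `V0` under gravity `G_ACC`, and the envelope is obtained from the standard condition that a family F(x, y, c) = 0 touches its envelope where also ∂F/∂c = 0. With c = tan θ the jets are y = cx − Kx²(1 + c²), K = g/(2v₀²); the condition gives c = 1/(2Kx) and hence the envelope y = 1/(4K) − Kx². The code compares this curve with the greatest height reached over a fine grid of slopes.

```python
import math

G_ACC = 9.81                  # gravitational acceleration, m/s^2 (assumed)
V0 = 5.0                      # common initial speed of the jets, m/s (assumed)
K = G_ACC / (2 * V0 ** 2)

def jet_height(x, c):
    """Height at abscissa x of the jet whose initial slope is c = tan(theta)."""
    return c * x - K * x * x * (1 + c * c)

def envelope_height(x):
    """Envelope obtained from F = 0 together with dF/dc = 0, i.e. c = 1/(2 K x)."""
    return 1 / (4 * K) - K * x * x

def highest_jet(x, samples=20001, c_max=50.0):
    """Brute-force check: largest height reached at x over a grid of slopes."""
    return max(jet_height(x, c_max * i / (samples - 1)) for i in range(samples))

gap = envelope_height(1.0) - highest_jet(1.0)  # small and non-negative
```

No jet rises above the envelope, and at each abscissa some jet comes arbitrarily close to it, which is what tangency of the envelope to the members of the family amounts to here.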

Generally speaking, the tangent planes of a surface are different at 
different points, so that the family of tangent planes to the surface is 
obviously a two-parameter one. But in some cases, for example, a cylinder, 
it is a one-parameter family. It can be shown that the following remarkable 
theorem 
holds. A one-parameter family of tangent planes occurs only for those 
surfaces that are developable into a plane, i.e., those in which any 
sufficiently small segment may be deformed into a plane segment; these 
are the developable surfaces noted in §4. Every analytic surface of this 
kind consists of segments of straight lines and is either cylindrical 
(parallel straight lines) or conical (straight lines passing through one 
point), or consists of the tangents to some space curve. 

The theory of envelopes is particularly useful in engineering problems, 
for example in the theory of transmissions. We consider two gears A 
and B. To study their motion relative to each other, we may assume that 
gear A is stationary and gear B moves around it (figure 50). Then the 
contour of a cog on gear B, as it assumes various positions, traces out a 
family of curves in the plane of gear A, and the contour of gear A must at 
all times be tangent to them, i.e., must be the envelope of the family. Of 
course, this is not a complete statement of the situation, since in an 
actual transmission this engagement must be transferred from one pair of 
cogs to the next, but this condition is nevertheless the basic one which 
must be satisfied by every type of gear. 

As we have said, the question of envelopes is a relatively simple one, 
solved long ago, in the theory of families of curves and surfaces. This 
theory is just as rich in interesting problems as, let us say, the theory of 
surfaces itself. Especially well developed is the theory of “congruences,” 
i.e., two-parameter families of various curves (and in particular of straight 
lines: the so-called “straight-line” congruences). In this theory one 
applies essentially the same methods as in the theory of surfaces. 

Fig. 50. 

The theory of straight-line congruences originated in the paper of 
Monge, “On excavations and fills,” the title of which already shows that 
Monge undertook the investigation for practical purposes; the main idea 
was to find the most convenient way of transporting earth from an 
excavation to a fill. 

The systematic development of the theory of congruences, beginning 
in the middle of the last century, is due in large measure to its connection 
with geometric optics; the set of rays of light in a homogeneous medium 
at any time constitutes a straight-line congruence. 

Nonregular surfaces and geometry “in the large.” The theory of 
curves and surfaces (and of families of them), as it had been constructed 
by the end of the last century, is usually called classical differential 
geometry; it has the following characteristic features. 

First, it considers only “sufficiently smooth” (i.e., regular) curves and 
surfaces, namely those which are defined by functions with a sufficient 
number of derivatives. Thus, for example, surfaces with cusps or edges, 
such as polyhedral surfaces or the surface of a cone, are either excluded 
from the argument or are considered only on the parts where they remain 
smooth. 

Second, classical differential geometry pays especial attention to 
properties of sufficiently small segments of curves and surfaces (geometry 
“in the small”) and nowhere considers properties of an entire closed 
surface (geometry “in the large”). 

Typical examples, illustrating the distinction between geometry “in the 
small” and “in the large,” are provided by the deformation of surfaces. 
For example, already in 1838 Minding showed that a sufficiently small 
segment of the surface of a sphere can be deformed, and this is a theorem 
“in the small.” At the same time, he expressed the conjecture that the 
entire sphere cannot be deformed. This conjecture was proved by other 
mathematicians only in 1899. Incidentally, it is easy to confirm by 
experiment that a sphere of flexible but inextensible material cannot be 
deformed. For example, a ping-pong ball holds its shape perfectly well 
although the material it is made from is quite flexible. Another example, 
mentioned in §4, is the tin pail; it is rigid in the large, thanks to the presence 
of a curved flange, but separate pieces of it can easily be bent out of 
shape. As we see, there is an essential difference between properties of 
surfaces “in the small” and “in the large.” 

Other characteristic examples are provided by the theory of geodesics, 
discussed in §4. A geodesic “in the small,” i.e., on a small segment of 
the surface, is a shortest path, but “in the large” it may not be so at all; 
for example, it may even be a closed curve, as was pointed out earlier 
for great circles of a sphere. 

The reader will readily note that the theorems on geodesics formulated 
in §4 are basically theorems “in the small.” Questions on the behavior
of geodesic curves throughout their whole course belong to geometry
“in the large.” It is known, for example, that on a regular surface two
sufficiently adjacent points can be joined by a unique geodesic remaining
entirely in a certain small neighborhood of the two points. But if we consider
geodesics that during their course may depart as far as we like from
the two points, then by a theorem of Morse any pair of points on a
closed surface may be joined by an infinite number of geodesics. Thus,
two points A and B on the lateral surface of a curved cylinder may be
joined by very different geodesics: it is sufficient to consider helices which
run from A to B but wind around the cylinder a different number of
times. The theorem of Poincaré on closed geodesics, stated in §5 of
Chapter XVIII, and proved by Ljusternik and Snirelman, also belongs
to geometry “in the large.”
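
The cylinder example can be made concrete with a short computation. What follows is only a minimal sketch in Python, assuming a right circular cylinder of radius r (the text does not fix the shape or the size); the name helix_length and the sample values are illustrative. Unrolling the cylinder onto a plane turns each helix into a straight segment, so a helix that climbs a height h while turning through the angle dtheta + 2*pi*k has length sqrt(h^2 + r^2 (dtheta + 2*pi*k)^2).

```python
import math

def helix_length(r, h, dtheta, k):
    """Length of the helix from A to B that makes k extra full turns.

    Assumes a right circular cylinder of radius r; A and B differ by the
    height h and by the angle dtheta (in radians). Unrolled onto a plane,
    the helix is a straight segment with legs h and r*(dtheta + 2*pi*k).
    """
    return math.hypot(h, r * (dtheta + 2.0 * math.pi * k))

# Illustrative values (not from the text): unit radius, B one unit above A
# and a quarter turn around the axis.
lengths = [helix_length(1.0, 1.0, math.pi / 2.0, k) for k in range(4)]
for k, length in enumerate(lengths):
    print(f"k = {k}: length = {length:.4f}")
```

Each value of k (negative values wind the other way) gives a different geodesic joining the same two points; only the one with the smallest total turning angle is a shortest path.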

The proofs for these theorems, as for many theorems of geometry 
“in the large,” were inaccessible with the usual tools of classical differential 
geometry and required the invention of new methods. 

When these problems of geometry “in the large” were inevitably 
attracting the attention of mathematicians, the restriction to regular 
surfaces could no longer be maintained, if only because we are continually 
encountering surfaces that are not regular but have discontinuous
curvature; for example, convex lenses with a sharp edge, and so forth. Moreover,
there are many analytic surfaces that cannot be extended in any natural 
way without acquiring “singularities” in the form of edges or cusps and 
thus becoming nonregular. 

Thus, a segment of the surface of a cone cannot be extended in a natural
way without leading to the vertex, a cusp where the smoothness of the
surface is destroyed.

This last result is only a particular case of the following remarkable
theorem. Every developable surface other than a cylinder will lead, if
naturally extended, to an edge (or a cusp in the case of a cone) beyond
which it cannot be continued without losing its regularity. 

Thus there is a profound connection between the behavior of a surface 
“in the large” and its singularities. This is the reason why the solution 
of problems “in the large” and the study of surfaces with “singularities” 
(edges, cusps, discontinuous curvature and the like) must be worked out 
together. 

Similar new directions were taken in analysis. For example, the 
qualitative theory of differential equations (mentioned in §7 of Chapter V)
studies the properties of solutions of a differential equation in its entire
domain of definition, i.e., “in the large,” paying particular attention to
“singularities,” i.e., to violations of regularity, and to singular points of
the equation. Moreover, contemporary analysis includes the study of
nonregular functions which did not occur in classical analysis (cf. Chapter
XV) and thereby provides geometry with a new means of studying more
general surfaces. Finally, in the calculus of variations, where we are
usually looking for curves or surfaces with some extremal property, it
sometimes happens that the limit curve, for which the extreme is attained,
is not regular. For such problems it is necessary that the class of curves
or surfaces under consideration should be closed (that is, should include
all its limit curves or surfaces), a fact which necessarily led to the study
of at least the simplest nonregular curves and surfaces. In a word, the 
new directions taken by geometry did not originate in isolation but in 
close connection with the whole development of mathematics. 

The turning of attention to problems “in the large” and nonregular 
surfaces began about 50 years ago and was shared by many mathemati- 
cians. The first essential step was taken by Hermann Minkowski (1864- 
1909), who laid the foundation for an extensive branch of geometry, the 
theory of convex bodies. Incidentally, one of the questions which started
Minkowski on his investigations was the problem of regular lattices,
which is closely connected with the theory of numbers and geometric
crystallography. 

A body is called convex if through each point of its surface we may
pass a plane that does not intersect the body, i.e., at any point of its
surface the body may rest on a plane (figure 51). A convex body is defined
by its surface alone, so that for the most part it makes no difference
whether we speak of the theory of convex bodies or of closed convex
surfaces. The general theorems on convex bodies are proved, as a rule,
without any additional assumptions about the smoothness or “regularity”
of their surfaces. Thus these theorems are usually concerned with the
whole convex body or surface, so that the restrictions of classical
differential geometry are automatically removed. However, the two theories
(of convex bodies and of nonregular surfaces) were at first very little
connected with each other, the combination of the two taking place
considerably later.

Fig. 51.

Beginning in 1940, A. D. Aleksandrov developed the theory of general 
curves and surfaces, including both the regular surfaces of classical 
differential geometry and also such nonsmooth surfaces as polyhedra, 
arbitrary convex sets, and others. In spite of the great generality of this 
theory, it is chiefly based on intuitive geometric concepts and methods,
although it also makes essential use of contemporary analysis. One of
the basic methods of the theory consists of approximating general surfaces
by means of polyhedra (polyhedral surfaces). This device in its simplest
form is known to every schoolboy, for example, in computing the area
of the lateral surface of a cylinder as the limit of the areas of prisms.
In a number of cases the method produces strong results that either
cannot be derived in another way or else, if they are to be proved by an
analytic method, require the introduction of complicated ideas. Its
essential feature consists of the fact that the result is first obtained for
polyhedra and is then extended to general surfaces by a limit process. 

One of the beginnings of the theory of general convex surfaces was
the theorem on the conditions under which a given evolute (cf. figure 52)
may be pasted together to form a convex polyhedron. This theorem,
completely elementary in its formulation, has a nonelementary proof and
leads to far-reaching corollaries for general convex surfaces. The reader
is, of course, familiar with the pasting together of a polyhedral surface from
segments; for example, the assembling of a cube from the cross-shaped
pattern in figure 52, or of a cylinder from a rectangle and two circles.
This simple example of assembling surfaces from segments of them is
converted into a general method of “cutting apart and pasting together,”
which has produced profound results in various questions of the theory
of surfaces and has found practical applications.

Fig. 52.
Deep-lying results in this theory were obtained by A. V. Pogorelov.
In particular, he showed that no closed convex surface can be
deformed as a whole with preservation of its convexity. This result,
achieved in 1949, completes the efforts of many well-known mathematicians,
who for the preceding 50 years had tried to prove it but had been
successful only under various additional hypotheses. The results of
Pogorelov, in conjunction with the “method of pasting together,” not
only provided a complete solution for the problem, but almost completely
cleared up the whole question of the deformability or nondeformability
of closed and nonclosed convex surfaces. They also established a
close connection between the new theory and “classical” differential
geometry.

In this way a theory of surfaces was constructed that included the 
classical theory as well as the theory of polyhedra, of arbitrary convex 
surfaces, and of very general nonconvex surfaces. Lack of space does 
not allow us to discuss in detail the results or the still unsolved problems 
of the theory, although this could readily be done, since they are for 
the most part quite easily visualized and, in spite of the difficulty of 
exact proofs, do not require any special knowledge.

In §4, in speaking of the deformation of surfaces, we had in mind 
deformations of a regular (continuously curved) surface that preserved 
its regularity. But in the theorem of Pogorelov, on the contrary, there 
is no requirement of regularity for either the initial or the deformed 
surface, although the requirement of convexity is imposed on both 
surfaces. 

It is obvious that deformation of a sphere, for example, becomes 
possible if we allow breaks in the surface and violation of the convexity. 
It is sufficient to cut out a segment of the surface and then replace it
after the deformation; that is, so to speak, to push a segment of the
surface into the interior. Considerably more unexpected is the result
obtained recently by the American mathematician Nash and the Dutch
mathematician Kuiper. They showed that if we preserve only the
smoothness of a surface and allow the appearance of any number of sharp jumps
in the curvature of the surface (i.e., if we eliminate any requirement of
continuity, boundedness, or even existence of the second derivatives of
the functions defining the surface) then it turns out to be possible to
deform the surface as a whole with a very great degree of arbitrariness.
In particular a sphere may be deformed into an arbitrarily small ball,
which has a smooth surface consisting of very shallow wavelike creases.
Some idea of a deformation of this sort may be gained by the easily
imagined possibility of rumpling up into almost any shape a spherical
cover made of very soft cloth. On the other hand, a small celluloid ball
behaves very differently. The elastic material of its surface resists not
only extension but also sharp bending, so that such a ball is very rigid.

Differential geometry of various groups of transformations. At the
beginning of this century, there arose from classical differential geometry
a series of new developments based on one general idea, namely the study
of properties of curves, surfaces, and families of curves and surfaces which
remain invariant under various types of transformations. Classical
differential geometry investigated properties invariant under translation;
but of course there is nothing to prevent us from considering other
geometric transformations. For example, a projective transformation is
one in which straight lines remain straight, and projective geometry,
which has been in existence for a long time, studies those properties of
figures that remain invariant under projective transformations. Ordinary
projective geometry remains similar, in the problems it investigates, to
the usual elementary and analytic geometry, whereas “projective
differential geometry” (the theory of curves, surfaces, and families developed
at the beginning of the present century) is similar to classical differential
geometry, except that it studies properties that are invariant under
projective transformations. Fundamental in this last direction were the
contributions of the American Wilczynski, the Italian Fubini, and the
Czech mathematician Čech.

In the same way arose “affine differential geometry,” which studies 
the properties of curves, surfaces, and families invariant under affine 
transformations, i.e., under transformations that not only take straight 
lines into straight lines but also preserve parallelism. The work of the
German mathematician Blaschke and his students developed this branch
of geometry into a general theory. Let us also mention “conformal
geometry,” in which one studies the properties of figures invariant under 
transformations that do not change the angles between curves. 

In general, the possible “geometries” are very diverse in character, 
since essentially any group of transformations may serve as the basis of 
a “geometry,” which then studies just those properties of figures that 
are left unchanged by the transformations of the group. This principle 
for the definition of geometries will be discussed further in Chapter XVII.

Other new directions in differential geometry are being successfully 
developed by Soviet geometers, S. P. Finikov, G. F. Laptev, and others.
But in our present outline it is not possible to give an account of all
the various investigations that are taking place nowadays in the different 
branches of differential geometry. 
VIII


THE CALCULUS OF
VARIATIONS


§1. Introduction

Examples of variational problems. We will be able to give a clearer
description of the general range of problems studied in the calculus of
variations,* if we first consider certain special problems.

1. The curve of fastest descent. The problem of the brachistochrone,
or the curve of fastest descent, was historically the first problem in the
development of the calculus of variations.

Among all curves connecting the points $M_1$ and $M_2$, it is required
to find that one along which a mathematical point, moving under the
force of gravity from $M_1$ with no initial velocity, arrives at the point $M_2$
in the least time.

To solve this problem we must consider all possible curves joining $M_1$
and $M_2$. If we choose a definite curve $l$, then to it will correspond
some definite value $T$ of the time taken for the descent of a material
point along it. The time $T$ will depend on the choice of $l$, and of all
curves joining $M_1$ and $M_2$ we must choose the one which corresponds
to the least value of $T$.

* The derivation of the name “calculus of variations” is explained later.

Fig. 1.

The problem of the brachistochrone may be expressed in the following way.

We draw a vertical plane through the points $M_1$ and $M_2$. The curve
of fastest descent must obviously lie in it, so that we may restrict
ourselves to such curves. We take the point $M_1$ as the origin, the axis
$Ox$ horizontal, and the axis $Oy$ vertical and directed downward
(figure 1). The coordinates of the point $M_1$ will be $(0, 0)$; the
coordinates of the point $M_2$ we will call $(x_2, y_2)$. Let us
consider an arbitrary curve described by the equation

$$y = f(x), \quad 0 \le x \le x_2, \tag{1}$$

where $f$ is a continuously differentiable function. Since the curve passes
through $M_1$ and $M_2$, the function $f$ at the ends of the segment $[0, x_2]$
must satisfy the conditions

$$f(0) = 0, \quad f(x_2) = y_2. \tag{2}$$

If we take an arbitrary point $M(x, y)$ on the curve, then the velocity
$v$ of a material point at this point of the curve will be connected with
the $y$-coordinate of the point by the well-known physical relation

$$\tfrac{1}{2} v^2 = g y,$$

or

$$v = \sqrt{2gy}.$$

The time necessary for a material point to travel along an element $ds$
of arc of the curve has the value

$$\frac{ds}{v} = \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx,$$

and thus the total time of the descent of the point along the curve from
$M_1$ to $M_2$ is equal to

$$T = \frac{1}{\sqrt{2g}} \int_0^{x_2} \frac{\sqrt{1 + y'^2}}{\sqrt{y}}\,dx. \tag{3}$$


Finding the brachistochrone is equivalent to the solution of the following
minimal problem: Among all possible functions (1) that satisfy conditions
(2), find that one which corresponds to the least value of the integral (3).


2. The surface of revolution of the least area. Among the curves
joining two points of a plane, it is required to find that one whose arc,
by rotation around the axis $Ox$, generates the surface with the least area.

We denote the given points by $M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ and consider
an arbitrary curve given by the equation

$$y = f(x). \tag{4}$$
If the curve passes through $M_1$ and $M_2$, the function $f$ will satisfy
the condition

$$f(x_1) = y_1, \quad f(x_2) = y_2. \tag{5}$$

When rotated around the axis $Ox$ this curve describes a surface with
area numerically equal to the value of the integral

$$S = 2\pi \int_{x_1}^{x_2} y \sqrt{1 + y'^2}\,dx. \tag{6}$$

This value depends on the choice of the curve, or equivalently of the
function $y = f(x)$. Among all functions (4) satisfying condition (5) we
must find that function which gives the least value to the integral (6).
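
To see how the area of the surface of revolution depends on the chosen curve, one can evaluate the integral numerically for two curves with the same end points. The following Python lines are only a sketch under stated assumptions: the interval [-A, A] with A = 0.5, the end height cosh(A), and the names revolution_area, area_segment, and area_catenary are all chosen here for illustration. The second candidate is the catenary y = cosh x, which is known to be an extremal of this problem; the text itself has not yet derived it.

```python
import math

def revolution_area(y, dy, x1, x2, n=20000):
    """Midpoint-rule value of S = 2*pi * integral of y*sqrt(1 + y'^2) dx."""
    h = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * h
        total += y(x) * math.sqrt(1.0 + dy(x) ** 2)
    return 2.0 * math.pi * total * h

A = 0.5                      # assumed half-width of the interval [-A, A]
END_HEIGHT = math.cosh(A)    # both end points lie at this height

# Candidate 1: the horizontal segment joining the end points (a cylinder).
area_segment = revolution_area(lambda x: END_HEIGHT, lambda x: 0.0, -A, A)
# Candidate 2: the catenary y = cosh x through the same end points.
area_catenary = revolution_area(math.cosh, math.sinh, -A, A)

print(f"segment:  {area_segment:.4f}")
print(f"catenary: {area_catenary:.4f}")
```

Under these assumptions the catenary gives the smaller area, although the straight segment is the shorter curve.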

3. Uniform deformation of a membrane. By a membrane we usually
mean an elastic surface that is plane in the state of rest, bends freely,
and does work only against extension. We assume that the potential
energy of a deformed membrane is proportional to the increase in the
area of its surface.

In the state of rest let the membrane occupy a domain $B$ of the $Oxy$
plane (figure 2). We deform the boundary of the membrane in a
direction perpendicular to $Oxy$ and denote by $\varphi(M)$ the displacement of
the point $M$ of the boundary. Then the interior of the membrane is also
deformed, and we are required to find the position of equilibrium of
the membrane for a given deformation of its boundary.

With a great degree of accuracy we may assume that all points of the
membrane are displaced perpendicularly to the plane $Oxy$. We denote
by $u(x, y)$ the displacement of the point $(x, y)$. The area of the membrane
in its displaced position will be*

$$\iint_B (1 + u_x^2 + u_y^2)^{1/2}\,dx\,dy.$$

* Here and everywhere in this chapter we use subscripts to denote the arguments
with respect to which the partial derivatives are taken.
If the deformations of the elements of the membrane are so small that
we can legitimately ignore higher powers of $u_x$ and $u_y$, this expression
for the area may be replaced by a simpler one:

$$\iint_B \left[1 + \tfrac{1}{2}(u_x^2 + u_y^2)\right] dx\,dy.$$

The change in the area of the membrane is equal to

$$\tfrac{1}{2} \iint_B (u_x^2 + u_y^2)\,dx\,dy,$$

so that the potential energy of the deformation will have the value

$$\frac{\mu}{2} \iint_B (u_x^2 + u_y^2)\,dx\,dy, \tag{7}$$

where $\mu$ is a constant depending on the elastic properties of the membrane.

Since the displacement of the points on the edge of the membrane is
assumed to be given, the function $u(x, y)$ will satisfy the condition

$$u = \varphi(M) \tag{8}$$

on the boundary of the domain B. 

In the position of equilibrium the potential energy of the deformation
must have the smallest possible value, so that the function $u(x, y)$,
describing the displacement of the points of the membrane, is to be found
by solving the following mathematical problem: Among all functions
$u(x, y)$ that are continuously differentiable on the domain $B$ and satisfy
condition (8) on the boundary, find the one which gives the least value
to the integral (7).
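
The minimum property can be probed numerically on a simple case. The sketch below is illustrative only and rests on several assumptions not made in the text: the domain is the unit square, the boundary displacement is the one produced by the plane u = x, the constant factor in front of the energy integral is dropped, and the names dirichlet_energy and perturbed_gradient are hypothetical. It compares the energy of u = x with that of u = x + eps*sin(pi x)*sin(pi y), which has the same boundary values.

```python
import math

def dirichlet_energy(ux, uy, n=200):
    """Midpoint-rule value of the double integral of (u_x^2 + u_y^2)
    over the unit square, given the two partial derivatives."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += ux(x, y) ** 2 + uy(x, y) ** 2
    return total * h * h

def perturbed_gradient(eps):
    """Partial derivatives of u = x + eps*sin(pi x)*sin(pi y)."""
    def ux(x, y):
        return 1.0 + eps * math.pi * math.cos(math.pi * x) * math.sin(math.pi * y)
    def uy(x, y):
        return eps * math.pi * math.sin(math.pi * x) * math.cos(math.pi * y)
    return ux, uy

energies = {eps: dirichlet_energy(*perturbed_gradient(eps)) for eps in (0.0, 0.1, 0.2)}
for eps, energy in energies.items():
    print(f"eps = {eps}: energy = {energy:.5f}")
```

For this family the energy works out to 1 + eps^2 * pi^2 / 2, so it is least for eps = 0. The sketch tests only one family of competitors and is not a proof of minimality.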

Extreme values of functionals and the calculus of variations. These
examples allow us to form some impression of the kind of problems
considered, but to define exactly the position of the calculus of variations
in mathematics, we must become acquainted with certain new concepts.
We recall that one of the basic concepts of mathematical analysis is
that of a function. In the simplest case the concept of functional
dependence may be described as follows. Let $M$ be any set of real numbers.
If to every number $x$ of the set $M$ there corresponds a number $y$, we say
that there is defined on the set $M$ a function $y = f(x)$. The set $M$ is often
called the domain of definition of the function.

The concept of a functional is a direct and natural generalization of
the concept of a function and includes it as a special case.
Let $M$ be a set of objects of any kind. The nature of these objects is
immaterial at this time. They may be numbers, points of a space, curves,
functions, surfaces, states or even motions of a mechanical system. For
brevity we will call them elements of the set $M$ and denote them by the
letter $x$.

If to every element x of the set M there corresponds a number y, we 
say that there is defined on the set M a functional y = F(x). 
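
In programming terms a functional whose elements are functions is a routine that takes a function as its argument and returns a number. Here is a minimal Python sketch; the particular functional (the integral of y(x)^2 over [0, 1]) and the name squared_norm are chosen purely for illustration and do not come from the text.

```python
def squared_norm(y, n=1000):
    """A functional: maps a function y on [0, 1] to the number
    integral of y(x)^2 dx, approximated by the midpoint rule."""
    h = 1.0 / n
    return sum(y((i + 0.5) * h) ** 2 for i in range(n)) * h

value_for_line = squared_norm(lambda x: x)       # exact value is 1/3
value_for_const = squared_norm(lambda x: 2.0)    # exact value is 4

print(value_for_line, value_for_const)
```

Here the set M is the set of functions that can be passed to squared_norm, and each such function is one element x of M.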

If the set $M$ is a set of numbers $x$, the functional $y = F(x)$ will be a
function of one argument. When $M$ is a set of pairs of numbers $(x_1, x_2)$
or a set of points of a plane, the functional will be a function $y = F(x_1, x_2)$
of two arguments, and so forth.

For the functional $y = F(x)$, we state the following problem:

Among all elements $x$ of $M$ find that element for which the functional
$y = F(x)$ has the smallest value.

The problem of the maximum of the functional is formulated in the 
same way. 

We note that if we change the sign in the functional $F(x)$ and consider
the functional $-F(x)$, the maximum (minimum) of $F(x)$ becomes the
minimum (maximum) of $-F(x)$. So there is no need to study both
maxima and minima; in what follows we will deal chiefly with minima
of functionals.

In the problem of the curve of fastest descent, the functional whose
minimum we seek will be the integral (3), the time of descent of a material
point along a curve. This functional will be defined on all possible functions
(1) satisfying condition (2).

In the problem of the position of equilibrium of a membrane, the 
functional is the potential energy (7) of the deformed membrane, and 
we must find its minimum on the set of functions u(x, y) satisfying the 
boundary condition (8). 

Every functional is defined by two factors: the set $M$ of elements $x$
on which it is given and the law by which to every element $x$ there
corresponds a number, the value of the functional. The methods of seeking the
least and greatest values of a functional will certainly depend on the
properties of the set $M$.

The calculus of variations is a particular chapter in the theory of 
functionals. In it we consider functionals given on a set of functions, 
and our problem consists of the construction of a theory of extreme 
values for such functionals. 

This branch of mathematics became particularly important after the
discovery of its connection with many situations in physics and mechanics.
The reason for this connection may be seen as follows. As will be made
clear later, it is necessary, in order that a function provide an extreme
value for a functional, that it satisfy a certain differential equation.
On the other hand, as was mentioned in the chapters describing
differential equations, the quantitative laws of mechanics and physics are
often written in the form of differential equations. As it turned out,
many equations of this type also occurred among the differential equations
of the calculus of variations. So it became possible to consider the
equations of mechanics and physics as extremal conditions for suitable
functionals and to state the laws of physics in the form of requiring an
extreme value, in particular a minimum, for certain quantities. New points
of view could thus be introduced into mechanics and physics, since
certain laws could be replaced by equivalent statements in terms of
“minimal principles.” This in turn opened up a new method of solving
physical problems, either exactly or approximately, by seeking the minima
of corresponding functionals.


§2. The Differential Equations of the Calculus of Variations 

The Euler differential equation. The reader will recall that a necessary
condition for the existence of an extreme value of a differentiable function
$f$ at a point $x$ is that the derivative $f'$ be equal to zero at this point:
$f'(x) = 0$; or what amounts to the same thing, that the differential of
the function be equal to zero here: $df = f'(x)\,dx = 0$.

Our immediate goal will be to find an analogue of this condition in the
calculus of variations, that is to say, to set up a necessary condition
that a function must satisfy in order to provide an extreme value for a
functional.

We will show that such a function must satisfy a certain differential
equation. The form of the equation will depend on the kind of functional
under consideration. We begin with the so-called simplest integral of the
calculus of variations, by which we mean a functional with the following
integral representation:

$$I(y) = \int_{x_1}^{x_2} F(x, y, y')\,dx. \tag{9}$$

The function $F$, occurring under the integral sign, depends on three
arguments $(x, y, y')$. We will assume it is defined and is twice continuously
differentiable with respect to the argument $y'$ for all values of this
argument, and with respect to the arguments $x$ and $y$ in some domain $B$
of the $Oxy$ plane. Below it is assumed that we always remain in the
interior of this domain.





It is clear that $y$ is a function of $x$,

$$y = y(x), \tag{10}$$

continuously differentiable on the segment $x_1 \le x \le x_2$, and that $y'$ is its
derivative.

Geometrically the function $y(x)$ may be represented on the $Oxy$ plane by a
curve $l$ over the interval $[x_1, x_2]$ (figure 3).

The integral (9) is a generalization of the integrals (3) and (6), which we
encountered in the problem of the curve of fastest descent and the
surface of revolution of least area. Its value depends on the choice of
the function $y(x)$ or in other words of the curve $l$, and the problem of
its minimum value is to be interpreted as follows:

Given some set $M$ of functions (10) (curves $l$); among these we must
find that function (curve $l$) for which the integral $I(y)$ has the least value.

We must first of all define exactly the set of functions $M$ for which
we will consider the value of the integral (9). In the calculus of variations
the functions of this set are usually called admissible for comparison.
We consider the problem with fixed boundary values. The set of admissible
functions is defined here by the following two requirements:

1. $y(x)$ is continuously differentiable on the segment $[x_1, x_2]$;

2. At the ends of the segment $y(x)$ has values given in advance:

$$y(x_1) = y_1, \quad y(x_2) = y_2. \tag{11}$$

Otherwise the function $y(x)$ may be completely arbitrary. In the language
of geometry, we are considering all possible smooth curves over the
interval $[x_1, x_2]$ which pass through the two points $A(x_1, y_1)$ and
$B(x_2, y_2)$ and can be represented by the equation (10). The function
giving the minimum of the integral will be assumed to exist and we will
call it $y(x)$.

Fig. 3.

The following simple and ingenious arguments, which can often be
applied in the calculus of variations, lead to a particularly simple form
of the necessary condition which $y(x)$ must satisfy. In essence they allow
us to reduce the problem of the minimum of the integral (9) to the problem
of the minimum of a function.

We consider the family of functions dependent on a numerical
parameter $\alpha$,

$$\bar{y}(x) = y(x) + \alpha\eta(x). \tag{12}$$
In order that $\bar{y}(x)$ be an admissible function for arbitrary $\alpha$, we must
assume that $\eta(x)$ is continuously differentiable and vanishes at the ends
of the interval $[x_1, x_2]$:

$$\eta(x_1) = \eta(x_2) = 0. \tag{13}$$

The integral (9) computed for $\bar{y}$ will be a function of the parameter $\alpha$:

$$I(\bar{y}) = \int_{x_1}^{x_2} F(x, y + \alpha\eta, y' + \alpha\eta')\,dx = \Phi(\alpha).$$ *

Since $y(x)$ gives a minimum to the value of the integral, the function
$\Phi(\alpha)$ must have a minimum for $\alpha = 0$, so that its derivative at this point
must vanish:

$$\Phi'(0) = \int_{x_1}^{x_2} \left[F_y(x, y, y')\,\eta + F_{y'}(x, y, y')\,\eta'\right] dx = 0. \tag{14}$$

This last equation must be satisfied for every continuously differentiable
function $\eta(x)$ which vanishes at the ends of the segment $[x_1, x_2]$. In order
to obtain the result which follows from this, it is convenient to transform
the second term in condition (14) by integration by parts:

$$\int_{x_1}^{x_2} F_{y'}\,\eta'\,dx = -\int_{x_1}^{x_2} \eta\,\frac{d}{dx}F_{y'}\,dx,$$

so that condition (14) takes the new form

$$\Phi'(0) = \int_{x_1}^{x_2} \left(F_y - \frac{d}{dx}F_{y'}\right)\eta\,dx = 0. \tag{15}$$
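
The condition that the derivative of the integral with respect to the parameter vanishes at zero can be checked numerically on a simple case. The following Python sketch is only an illustration under assumptions not made in the text: the integrand is F = y'^2 on the interval [0, 1] with end values 0 and 1, the comparison function is sin(pi x), and the names integral_I, phi, and phi_prime_at_zero are hypothetical. It estimates the derivative at zero by a central difference for the straight line y = x (an extremal of this integral) and for the parabola y = x^2 (not an extremal, but with the same end values).

```python
import math

def integral_I(dy, n=2000):
    """Midpoint-rule value of I(y) = integral over [0, 1] of y'(x)^2 dx,
    computed from the derivative dy of the function y."""
    h = 1.0 / n
    return sum(dy((i + 0.5) * h) ** 2 for i in range(n)) * h

def phi(dy, alpha):
    """I(y + alpha*eta) with eta(x) = sin(pi x), which vanishes at 0 and 1;
    the derivative of y + alpha*eta is dy + alpha*pi*cos(pi x)."""
    return integral_I(lambda x: dy(x) + alpha * math.pi * math.cos(math.pi * x))

def phi_prime_at_zero(dy, step=1e-4):
    """Central-difference estimate of the derivative of phi at alpha = 0."""
    return (phi(dy, step) - phi(dy, -step)) / (2.0 * step)

slope_line = phi_prime_at_zero(lambda x: 1.0)          # y = x
slope_parabola = phi_prime_at_zero(lambda x: 2.0 * x)  # y = x^2

print(f"y = x:   {slope_line:.6f}")
print(f"y = x^2: {slope_parabola:.6f}")
```

For the line the estimate is zero up to rounding, while for the parabola it is close to -8/pi, so the parabola cannot give a minimum.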

It may be shown that the following simple lemma holds.

Let the following two conditions be fulfilled:

1. The function $f(x)$ is continuous on the interval $[a, b]$;

2. The function $\eta(x)$ is continuously differentiable on the interval
$[a, b]$ and vanishes at the ends of this interval.

If for an arbitrary function $\eta(x)$ the integral $\int_a^b f(x)\,\eta(x)\,dx$ is equal
to zero, then it follows that $f(x) \equiv 0$.

* The difference $\bar{y} - y = \alpha\eta$ is called the variation (change) of the function $y$ and
is denoted by $\delta y$, and the difference $I(\bar{y}) - I(y)$ is called the total variation of the
integral (9). Hence we get the name calculus of variations.
For let us assume that at some point $c$ the function $f$ is different from
zero and show that then a function $\eta(x)$ necessarily exists for which
$\int_a^b f(x)\,\eta(x)\,dx \ne 0$, in contradiction to the condition of the lemma.

Since $f(c) \ne 0$ and $f$ is continuous, there must exist a neighborhood
$[\alpha, \beta]$ of $c$ in which $f$ will be everywhere different from zero and
thus will have a constant sign throughout.

We can always construct a function $\eta(x)$ which is continuously differentiable
on $[a, b]$, positive on $[\alpha, \beta]$, and equal to zero outside of
$[\alpha, \beta]$ (figure 4).

Fig. 4.

Such a function $\eta(x)$, for example, is defined by the equations

$$\eta(x) = \begin{cases} 0 & \text{on } [a, \alpha], \\ (x - \alpha)^2 (\beta - x)^2 & \text{on } [\alpha, \beta], \\ 0 & \text{on } [\beta, b]. \end{cases}$$

But for such a function $\eta(x)$

$$\int_a^b f\eta\,dx = \int_\alpha^\beta f\eta\,dx.$$

The latter of these integrals cannot be equal to zero since, in the interior
of the interval of integration, the product $f\eta$ is different from zero and
never changes its sign.
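
The construction used in the proof is easy to try out numerically. The Python lines below are a sketch with illustrative choices only: the function f(x) = x - 0.5 on [0, 1], the point c = 0.8, and the neighborhood [0.6, 1.0] are assumptions made for this example, and the names eta and integrate are hypothetical.

```python
def eta(x, alpha, beta):
    """The function from the proof: (x - alpha)^2 (beta - x)^2 on
    [alpha, beta] and zero outside this interval."""
    if alpha <= x <= beta:
        return (x - alpha) ** 2 * (beta - x) ** 2
    return 0.0

def integrate(g, a, b, n=4000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def f(x):
    """Illustrative continuous function, nonzero at c = 0.8."""
    return x - 0.5

ALPHA, BETA = 0.6, 1.0   # f is positive throughout [ALPHA, BETA]

whole = integrate(lambda x: f(x) * eta(x, ALPHA, BETA), 0.0, 1.0)
inner = integrate(lambda x: f(x) * eta(x, ALPHA, BETA), ALPHA, BETA)

print(whole, inner)
```

The two integrals agree up to quadrature error and are positive, which is exactly the contradiction the proof relies on.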

Since equation (15) must be satisfied for every $\eta(x)$ that is continuously
differentiable and vanishes at the ends of the segment $[x_1, x_2]$, we may
assert, on the basis of the lemma, that this can occur only in the case

$$F_y - \frac{d}{dx}F_{y'} = 0, \tag{16}$$

or, by computing the derivative with respect to $x$,

$$F_y(x, y, y') - F_{xy'}(x, y, y') - F_{yy'}(x, y, y')\,y' - F_{y'y'}(x, y, y')\,y'' = 0. \tag{17}$$

This equation is a differential equation of the second order with respect
to the function $y$. It is called Euler’s equation.
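
As a short worked illustration (not taken from the text), consider the arc-length integral, for which the integrand is assumed to be $F = \sqrt{1 + y'^2}$. Each term of Euler's equation can then be computed directly:

```latex
\begin{aligned}
F &= \sqrt{1 + y'^2}, \qquad F_y = 0, \qquad F_{y'} = \frac{y'}{\sqrt{1 + y'^2}}, \\
F_{xy'} &= 0, \qquad F_{yy'} = 0, \qquad F_{y'y'} = \frac{1}{(1 + y'^2)^{3/2}}, \\
F_y - F_{xy'} - F_{yy'}\,y' - F_{y'y'}\,y'' &= -\frac{y''}{(1 + y'^2)^{3/2}} = 0
\quad\Longrightarrow\quad y'' = 0, \qquad y = C_1 + C_2 x .
\end{aligned}
```

So for this integrand the only curves that satisfy Euler's equation are straight lines, in agreement with the familiar fact that a straight segment is the shortest path between two points of the plane.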

We may state the following conclusion.

If a function $y(x)$ minimizes the integral $I(y)$, then it must satisfy
Euler’s differential equation (17). In the calculus of variations, this last
statement has a meaning completely analogous to the necessary condition
$df = 0$ in the theory of extreme values of functions. It allows us
immediately to exclude all admissible functions that do not satisfy this
condition, since for them the integral cannot have a minimum, so that
the set of admissible functions we need to study is very sharply reduced.

Solutions of equation (17) have the property that for them the derivative
$\left[\frac{d}{d\alpha} I(y + \alpha\eta)\right]_{\alpha = 0}$ vanishes for arbitrary $\eta(x)$, so that they are
analogous in meaning to the stationary points of a function. Thus it is often
said that for solutions of (17) the integral $I(y)$ has a stationary value.

In our problem with fixed boundary values, we do not need to find
all solutions of the Euler equation but only those which take on the
values $y_1$, $y_2$ at the points $x_1$, $x_2$.

We turn our attention to the fact that the Euler equation (17) is of
the second order. Its general solution will contain two arbitrary constants:

$$y = \varphi(x, C_1, C_2).$$

These must be defined so that the integral curve passes through the points
$A$ and $B$, so we have the two equations for finding the constants $C_1$ and $C_2$:

$$\varphi(x_1, C_1, C_2) = y_1, \quad \varphi(x_2, C_1, C_2) = y_2.$$

In many cases this system has only one solution and then there will
exist only one integral curve passing through A and B.
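
Determining the two constants from the two end conditions can be sketched in a few lines of Python. The example is an assumption made for illustration and does not come from the text: for the integrand F = y'^2 + y^2 Euler's equation reduces to y'' = y, whose general solution is y = C1*exp(x) + C2*exp(-x); the name solve_constants and the end points (0, 0) and (1, 1) are hypothetical.

```python
import math

def solve_constants(x1, y1, x2, y2):
    """Find C1, C2 such that y = C1*exp(x) + C2*exp(-x) passes through
    (x1, y1) and (x2, y2). The 2x2 linear system is solved by Cramer's rule."""
    a11, a12 = math.exp(x1), math.exp(-x1)
    a21, a22 = math.exp(x2), math.exp(-x2)
    det = a11 * a22 - a12 * a21   # equals -2*sinh(x2 - x1), nonzero if x1 != x2
    c1 = (y1 * a22 - a12 * y2) / det
    c2 = (a11 * y2 - a21 * y1) / det
    return c1, c2

C1, C2 = solve_constants(0.0, 0.0, 1.0, 1.0)

def extremal(x):
    """The unique solution of y'' = y through (0, 0) and (1, 1)."""
    return C1 * math.exp(x) + C2 * math.exp(-x)

print(C1, C2, extremal(0.5))
```

In this example the system for the constants is linear and always has exactly one solution, so there is exactly one integral curve through the two points; for other integrands the system may be nonlinear and may have several solutions or none.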

The search for functions giving a minimum for this integral is thus
reduced to the solution of the following boundary-value problem for
differential equations: On the interval $[x_1, x_2]$ find those solutions of
equation (17) that have the given values $y_1$, $y_2$ at the ends of the interval.

Frequently this last problem can be solved by using known methods
in the theory of differential equations.
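The determination of the two constants can be sketched numerically. The following is only an illustration under an assumption not taken from the text: we suppose the Euler equation is y'' = y (as it would be for the hypothetical functional $\int (y'^2 + y^2)\,dx$), whose general solution is $C_1 e^x + C_2 e^{-x}$, and we solve the two boundary equations for $C_1$ and $C_2$ by Cramer's rule. The function names are ours.

```python
import math

def fit_constants(x1, y1, x2, y2):
    """Solve C1*e^x + C2*e^(-x) = y at both endpoints for (C1, C2) by Cramer's rule."""
    a11, a12 = math.exp(x1), math.exp(-x1)
    a21, a22 = math.exp(x2), math.exp(-x2)
    det = a11 * a22 - a12 * a21          # equals -2*sinh(x2 - x1), nonzero when x1 != x2
    c1 = (y1 * a22 - a12 * y2) / det
    c2 = (a11 * y2 - a21 * y1) / det
    return c1, c2

def integral_curve(x, c1, c2):
    """Member of the assumed general solution y = C1*e^x + C2*e^(-x)."""
    return c1 * math.exp(x) + c2 * math.exp(-x)

# Boundary values y(0) = 0, y(1) = 1 (chosen only for illustration).
c1, c2 = fit_constants(0.0, 0.0, 1.0, 1.0)
print(c1, c2, integral_curve(0.5, c1, c2))
```

In this assumed case the system has exactly one solution, so exactly one integral curve passes through the two given points.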

We emphasize again that every solution of such a boundary-value
problem can provide only a suspected minimum and that it is necessary
to verify whether or not it actually does give a minimum value to the
integral. But in particular cases, especially in those occurring in the
applications, Euler's equation completely solves the problem of finding
the minimum of the integral. Suppose we know initially that a function
giving a minimum for the integral exists, and assume, moreover, that
the Euler equation (17) has only one solution satisfying the boundary
conditions (11). Then only one of the admissible curves can be a suspected
minimum, and we may be sure, under these circumstances, that the
solution found for the equation (17) indeed gives a minimum for the
integral.

Example. It was previously established that the problem of the curve
of fastest descent may be reduced to finding the minimum of the integral

$$\int_0^{x_2} \frac{\sqrt{1 + y'^2}}{\sqrt{y}}\,dx$$

among the set of functions satisfying the boundary conditions

$$y(0) = 0, \qquad y(x_2) = y_2.$$

In this problem

$$F = \frac{\sqrt{1 + y'^2}}{\sqrt{y}}.$$

Euler's equation has the form

$$-\frac{1}{2}\,y^{-3/2}\sqrt{1 + y'^2} - \frac{d}{dx}\left(y^{-1/2}\,\frac{y'}{\sqrt{1 + y'^2}}\right) = 0.$$

After some manipulation it takes the form

$$\frac{2y''}{1 + y'^2} = -\frac{1}{y}.$$

Multiplying both sides of the equation by y' and integrating, we get

$$\ln(1 + y'^2) = -\ln y + \ln k,$$

$$y'^2 = \frac{k}{y} - 1, \qquad \sqrt{\frac{y}{k - y}}\,dy = \pm dx.$$

Now letting

$$y = \frac{k}{2}(1 - \cos u), \qquad dy = \frac{k}{2}\sin u\,du,$$

we find after substituting and simplifying

$$\frac{k}{2}(1 - \cos u)\,du = \pm dx,$$

from which, by integrating, we get $x = \pm\frac{k}{2}(u - \sin u) + C$. Since
the curve must pass through the origin, it follows that we must put C = 0.
In this way we see that the brachistochrone is the cycloid

$$x = \frac{k}{2}(u - \sin u), \qquad y = \frac{k}{2}(1 - \cos u).$$

The constant k must be found from the condition that this curve passes
through the point $M_2(x_2, y_2)$.
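The last two steps can be checked numerically. The sketch below is ours, not part of the original text: it evaluates the residual of the reduced Euler equation $2y''/(1 + y'^2) = -1/y$ along the parametric cycloid, and it finds k for a given endpoint by bisection on the parameter u. The bisection relies on the fact that x/y = (u - sin u)/(1 - cos u) increases monotonically on (0, 2π); the endpoint (1, 1) is an arbitrary illustrative choice.

```python
import math

def cycloid_point(k, u):
    """Point (x, y) on the cycloid x = k/2 (u - sin u), y = k/2 (1 - cos u)."""
    return 0.5 * k * (u - math.sin(u)), 0.5 * k * (1.0 - math.cos(u))

def euler_residual(k, u):
    """Residual 2 y''/(1 + y'^2) + 1/y, evaluated parametrically on the cycloid."""
    dx_du = 0.5 * k * (1.0 - math.cos(u))
    dy_du = 0.5 * k * math.sin(u)
    yp = dy_du / dx_du                      # y' = dy/dx = sin u / (1 - cos u)
    dyp_du = -1.0 / (1.0 - math.cos(u))     # derivative of sin u / (1 - cos u) in u
    ypp = dyp_du / dx_du                    # y'' = (dy'/du) / (dx/du)
    _, y = cycloid_point(k, u)
    return 2.0 * ypp / (1.0 + yp * yp) + 1.0 / y

def fit_cycloid(x2, y2, tol=1e-12):
    """Find (k, u2) such that the cycloid passes through (x2, y2), with x2, y2 > 0."""
    target = x2 / y2
    lo, hi = 1e-6, 2.0 * math.pi - 1e-6
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        ratio = (mid - math.sin(mid)) / (1.0 - math.cos(mid))
        if ratio < target:
            lo = mid
        else:
            hi = mid
    u2 = 0.5 * (lo + hi)
    k = 2.0 * y2 / (1.0 - math.cos(u2))
    return k, u2

k, u2 = fit_cycloid(1.0, 1.0)
print(k, u2, cycloid_point(k, u2), euler_residual(k, 1.3))
```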

Functionals depending on several functions. The simplest functional
in the calculus of variations (17) depended on only one function. In the
applications such functionals will occur in those cases where the objects
(or their behavior) are defined by only one functional dependence. For
example, a curve in the plane is defined by the dependence of the ordinate
of a point on its abscissa, the motion of a material point along an axis
is defined by the dependence of its coordinate on time, etc.

But we must often deal with objects that cannot be defined so simply.
In order to define a curve in space, we must know the functional
dependence of two of its coordinates on the third. The motion of a point
in space is defined by the dependence of its three coordinates on time,
etc. Study of these more complicated objects leads to variational problems
with several varying functions.

We will restrict ourselves to cases in which the functional depends on
two functions y(x) and z(x), since the case of a larger number of functions
does not differ in principle from this one.

We consider the following problem. Admissible pairs of functions
y(x) and z(x) are defined by the conditions:

1. The functions

$$y = y(x), \qquad z = z(x) \tag{18}$$

are continuously differentiable on the segment $[x_1, x_2]$;

2. At the ends of the segment these functions have given values

$$y(x_1) = y_1, \quad y(x_2) = y_2, \qquad z(x_1) = z_1, \quad z(x_2) = z_2. \tag{19}$$

Among all possible pairs of functions y(x) and z(x), we must find the
pair that gives the least value to the integral

$$I(y, z) = \int_{x_1}^{x_2} F(x, y, z, y', z')\,dx. \tag{20}$$

In the three-dimensional space x, y, z, each pair of admissible functions
will correspond to a curve l, defined by equations (18) and passing through
the points

$$M_1(x_1, y_1, z_1), \qquad M_2(x_2, y_2, z_2).$$

We must find the minimum of the integral (20) on the set of all such
curves.

We assume that the pair of functions giving the minimum of the integral
(20) exists, and we will call these functions y(x) and z(x). Together with
them we consider a second pair of functions

$$\bar y = y + \alpha\eta(x), \qquad \bar z = z + \alpha\zeta(x),$$

where $\eta(x)$ and $\zeta(x)$ are any continuously differentiable functions vanishing
at the ends $x_1$, $x_2$ of the segment; $\bar y$, $\bar z$ will also be admissible, and for
$\alpha = 0$ they will coincide with the functions y, z. We substitute them in (20):

$$I(\bar y, \bar z) = \int_{x_1}^{x_2} F(x,\, y + \alpha\eta,\, z + \alpha\zeta,\, y' + \alpha\eta',\, z' + \alpha\zeta')\,dx = \Phi(\alpha).$$

The integral so derived will be a function of $\alpha$. Since $\bar y$ and $\bar z$ coincide
with y and z when $\alpha = 0$, the function $\Phi(\alpha)$ must have a minimum for
$\alpha = 0$. But at a minimum point the derivative of $\Phi$ must vanish:

$$\Phi'(0) = 0.$$

Computing the derivative gives

$$\int_{x_1}^{x_2} \left(F_y\,\eta + F_z\,\zeta + F_{y'}\,\eta' + F_{z'}\,\zeta'\right)dx = 0,$$

or, if the terms in $\eta'$ and $\zeta'$ are integrated by parts,

$$\int_{x_1}^{x_2} \left[\left(F_y - \frac{d}{dx}F_{y'}\right)\eta(x) + \left(F_z - \frac{d}{dx}F_{z'}\right)\zeta(x)\right]dx = 0.$$

This last equation must be satisfied for any two continuously differentiable
functions $\eta(x)$ and $\zeta(x)$ vanishing at the ends of the interval. Hence, from
the basic lemma proved earlier, the following two conditions must be
fulfilled:

$$F_y - \frac{d}{dx}F_{y'} = 0, \qquad F_z - \frac{d}{dx}F_{z'} = 0. \tag{21}$$

Hence, if the functions y, z give a minimum for the integral (20), they
must satisfy the system of Euler differential equations (21).

This result again allows us to replace a variational problem for the
minimum of the integral (20) by a boundary-value problem in the theory
of differential equations: On the interval $[x_1, x_2]$, we must find those
solutions y, z of the system of differential equations (21) that satisfy
the boundary conditions (19).

As in the preceding case, this opens up a possible path for the solution
of the minimal problem.

As an example of an application of the Euler system (21), let us consider
the variational principle of Ostrogradskii-Hamilton in Newtonian
mechanics. We restrict ourselves to the simplest form of this principle.

We consider a material body of mass m and assume that the dimensions
and form of the body may be ignored, so that we may consider it as a
material point.

We assume that the point moves from its position $M_1(x_1, y_1, z_1)$ at
time $t_1$ to the position $M_2(x_2, y_2, z_2)$ at time $t_2$. We also assume that
the motion occurs under the laws of Newtonian mechanics and is caused
by application of a force F(x, y, z, t) which depends on the position of
the point and on the time t and possesses a potential function U(x, y, z, t).
This last condition means the following: the components $F_x$, $F_y$, $F_z$ of
the force F along the coordinate axes are the partial derivatives of a
function U with respect to the corresponding coordinates

$$F_x = \frac{\partial U}{\partial x}, \qquad F_y = \frac{\partial U}{\partial y}, \qquad F_z = \frac{\partial U}{\partial z}.$$

We assume the motion to be free, that is, not subject to any kind of
constraints.*

The equations of motion of Newton are

$$m\frac{d^2x}{dt^2} = \frac{\partial U}{\partial x}, \qquad m\frac{d^2y}{dt^2} = \frac{\partial U}{\partial y}, \qquad m\frac{d^2z}{dt^2} = \frac{\partial U}{\partial z}.$$

If the point obeys the laws of Newtonian mechanics, it moves in a
completely determined manner. But together with these "Newtonian
motions" of the point, let us consider other (non-Newtonian) motions,
which for brevity we will call "admissible," and which will be defined
by two requirements only, that at time $t_1$ the point is in the position $M_1$
and at time $t_2$ is in the position $M_2$.

How can we distinguish the "Newtonian motion" of the point from
these other "admissible" motions? Such a possibility is given by the
Ostrogradskii-Hamilton principle.

We introduce the kinetic energy of the point

$$T = \tfrac{1}{2}m\,(x'^2 + y'^2 + z'^2)$$

and form the so-called action integral

$$I = \int_{t_1}^{t_2} (T + U)\,dt.$$

The principle states: The "Newtonian motion" of the point is
distinguished among all its "admissible" motions by the fact that it gives the
action integral a stationary value.

* This is not essential for the Ostrogradskii-Hamilton principle; we may impose
any restraints we like on the mechanical system, even nonstationary ones, provided
only that they are holonomic, i.e., that they may be described in the form of equations
not containing derivatives of the coordinates with respect to time.

The action integral I depends on three functions: x(t), y(t), z(t).
Since for all the motions under comparison the initial and final positions
of the point are identical, the boundary values of these functions are
fixed. We are dealing here with a variational problem for three varying
functions with fixed values at the ends of the interval $[t_1, t_2]$.
Previously we agreed to say that the integral (17) has a stationary
value for any curve which is an integral curve of the Euler equation.
In our problem we are integrating a function

$$F = T + U = \tfrac{1}{2}m\,(x'^2 + y'^2 + z'^2) + U(x, y, z, t)$$

which depends on three functions, so that for a stationary value of the
integral we must satisfy the system of three differential equations

$$F_x - \frac{d}{dt}F_{x'} = 0, \qquad F_y - \frac{d}{dt}F_{y'} = 0, \qquad F_z - \frac{d}{dt}F_{z'} = 0.$$

Since $F_x = \partial U/\partial x$, $F_{x'} = mx'$, ..., the system of Euler equations is
identical with the equations of motion of Newtonian mechanics, which
provides a verification of the Ostrogradskii-Hamilton principle.
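The principle can be illustrated on a discretized action. The sketch below is an assumption-laden toy, not the text's own construction: one coordinate z(t), the force function U = -mgz (uniform gravity, so that the Newtonian equation is z'' = -g), equal time steps, and U averaged over each step. For the Newtonian parabola the derivative of the discrete action with respect to every interior ordinate vanishes; for another admissible motion (the point staying at rest) it does not.

```python
def discrete_action(z, h, m=1.0, g=9.81):
    """Sum over time steps of h*(T + U), with U = -m*g*z averaged over the step."""
    total = 0.0
    for i in range(len(z) - 1):
        v = (z[i + 1] - z[i]) / h                      # velocity on the step
        kinetic = 0.5 * m * v * v
        force_fn = -m * g * 0.5 * (z[i] + z[i + 1])    # force function U
        total += h * (kinetic + force_fn)
    return total

def action_gradient(z, h, eps=1e-6):
    """Central-difference derivative of the action in each interior ordinate."""
    grad = []
    for i in range(1, len(z) - 1):
        up = list(z)
        up[i] += eps
        dn = list(z)
        dn[i] -= eps
        grad.append((discrete_action(up, h) - discrete_action(dn, h)) / (2 * eps))
    return grad

n, t_end, g = 20, 1.0, 9.81
h = t_end / n
times = [i * h for i in range(n + 1)]
newtonian = [0.5 * g * t * (t_end - t) for t in times]   # z'' = -g, z(0) = z(t_end) = 0
at_rest = [0.0 for _ in times]                            # admissible, but not Newtonian

print(max(abs(d) for d in action_gradient(newtonian, h)))
print(max(abs(d) for d in action_gradient(at_rest, h)))
```

The endpoints are held fixed in both motions, in line with the requirement that all admissible motions share the same initial and final positions.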


The minimum problem for a multiple integral. The last problem in
the calculus of variations to which we wish to draw the attention of the
reader is the problem of minimizing a multiple integral. Since the facts
connected with the solution of such problems are similar for integrals
of any multiplicity, we will confine ourselves to the simplest case, that
of double integrals.

Let B be a domain in the Oxy plane, bounded by the contour l. The set
of admissible functions is defined by the conditions:

1. u(x, y) is continuously differentiable on the domain B,

2. On l the function u takes given values

$$u\,|_l = f(M). \tag{22}$$

Among all such functions we must find the one which gives a minimum
value for the integral

$$I(u) = \iint_B F(x, y, u, u_x, u_y)\,dx\,dy. \tag{23}$$

The given boundary values (22) for the function u in the space (x, y, u)
determine a given space curve $\Gamma$, lying above l (cf. figure 2, Chapter VII).

We consider all possible surfaces S passing through $\Gamma$ and lying above B.
Among these we want to find the one for which the integral (23) is minimal.

As before, we assume the existence of the minimizing function and
denote it by u. At the same time we consider another function

$$\bar u = u + \alpha\eta(x, y),$$

where $\eta(x, y)$ is any continuously differentiable function vanishing on l.
Then the function

$$I(\bar u) = \iint_B F(x,\, y,\, u + \alpha\eta,\, u_x + \alpha\eta_x,\, u_y + \alpha\eta_y)\,dx\,dy = \Phi(\alpha)$$

must have a minimum for $\alpha = 0$. In this case its first derivative must be
equal to zero for $\alpha = 0$:

$$\Phi'(0) = 0,$$

or

$$\iint_B \left(F_u\,\eta + F_{u_x}\,\eta_x + F_{u_y}\,\eta_y\right)dx\,dy = 0. \tag{24}$$

We transform the last two terms by Ostrogradskii's formula:

$$\iint_B \left(F_{u_x}\eta_x + F_{u_y}\eta_y\right)dx\,dy
= \iint_B \left[\frac{\partial}{\partial x}\left(F_{u_x}\eta\right) + \frac{\partial}{\partial y}\left(F_{u_y}\eta\right)\right]dx\,dy - \iint_B \left(\frac{\partial}{\partial x}F_{u_x} + \frac{\partial}{\partial y}F_{u_y}\right)\eta\,dx\,dy$$

$$= \int_l \left[F_{u_x}\cos(n, x) + F_{u_y}\cos(n, y)\right]\eta\,ds - \iint_B \left(\frac{\partial}{\partial x}F_{u_x} + \frac{\partial}{\partial y}F_{u_y}\right)\eta\,dx\,dy.$$

The contour integral along l must vanish, since on the contour l the
function $\eta$ is equal to zero, so that condition (24) may be put in the form

$$\iint_B \left(F_u - \frac{\partial}{\partial x}F_{u_x} - \frac{\partial}{\partial y}F_{u_y}\right)\eta\,dx\,dy = 0.$$

This equation must be satisfied for every function $\eta$ which is continuously
differentiable and vanishes on the boundary l.

We may conclude, as before, that at all points of the domain B the equation

$$F_u - \frac{\partial}{\partial x}F_{u_x} - \frac{\partial}{\partial y}F_{u_y} = 0 \tag{25}$$

must be satisfied.

So if the function u gives a minimum for the integral (23), it must
satisfy the partial differential equation (25).

As in all the preceding problems, we have here established a connection
between a variational problem of minimizing an integral and a
boundary-value problem for a differential equation (in this case partial).


Example. The displacement u(x, y) of points of a membrane with a
deformed boundary is to be found from the condition of the minimum
of the potential energy

$$\frac{\mu}{2}\iint_B \left(u_x^2 + u_y^2\right)dx\,dy$$

for the given boundary values $u\,|_l = \varphi$.

Omitting, for simplicity, the constant factor $\mu$, we may set

$$F = \tfrac{1}{2}\left(u_x^2 + u_y^2\right),$$

so that equation (25) has the form

$$\frac{\partial}{\partial x}(u_x) + \frac{\partial}{\partial y}(u_y) = 0,$$

or

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$

Thus the problem of determining the displacement of the points of a
membrane has been reduced to that of finding a harmonic function u
with given values on the boundary of the domain (cf. Chapter VI, §3).
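The connection between the minimum of the energy and harmonic functions can be checked on a grid. The sketch below is our own illustration under stated assumptions: the domain is the unit square, the energy is replaced by a sum of squared finite differences, u = x² - y² is taken as a sample harmonic function, and the perturbation η = x(1 - x)y(1 - y) vanishes on the boundary. Any such perturbation should increase the discrete energy.

```python
def dirichlet_energy(u, h):
    """Discrete analogue of (1/2) * double integral of (u_x^2 + u_y^2) on a square grid."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += ((u[i + 1][j] - u[i][j]) / h) ** 2
            if j + 1 < n:
                total += ((u[i][j + 1] - u[i][j]) / h) ** 2
    return 0.5 * total * h * h

n = 11
h = 1.0 / (n - 1)
# Sample harmonic function: the Laplacian of x^2 - y^2 is 2 - 2 = 0.
harmonic = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]
# Perturbation vanishing on the boundary of the square.
bump = [[(i * h) * (1 - i * h) * (j * h) * (1 - j * h) for j in range(n)]
        for i in range(n)]
perturbed = [[harmonic[i][j] + 0.5 * bump[i][j] for j in range(n)]
             for i in range(n)]

print(dirichlet_energy(harmonic, h), dirichlet_energy(perturbed, h))
```

Both surfaces share the same boundary values, so they are both admissible in the sense of condition (22); only the harmonic one is a candidate for the minimum.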


§3. Methods of Approximate Solution of Problems in the 
Calculus of Variations 

We conclude the present chapter with an indication of the ideas involved
in some of the approximation methods in the calculus of variations.

For definiteness we discuss the simplest functional

$$I(y) = \int_{x_1}^{x_2} F(x, y, y')\,dx$$

for fixed boundary values of the admissible functions.

Let y(x) be an exact solution of the problem of minimizing I, with
m = I(y) the corresponding minimal value of the integral. It would
appear that if we determine an admissible function $\bar y$ for which the value
of the integral $I(\bar y)$ is very near to m, we may assume that $\bar y$ will also
differ little from the exact solution y. Moreover, if we are able to construct
a sequence of admissible functions $\bar y_1, \bar y_2, \ldots$ for which $I(\bar y_n) \to m$, we
may expect that such a sequence will converge in some sense or other
to the solution y, so that computation of $\bar y_n$ with sufficiently large index
will allow us to find the solution to any desired degree of accuracy.

Depending on how we go about choosing the "minimizing sequence"
$\bar y_n$ (n = 1, 2, ...), we will have one or another of the various approximation
methods in the calculus of variations.

Historically, the first of these was the method of broken lines, or
Euler's method. We decompose the interval $[x_1, x_2]$ into a number of
segments. For example, if we choose these segments of equal length,
the points of division will be

$$x_1,\ x_1 + h,\ x_1 + 2h,\ \ldots,\ x_1 + nh = x_2, \qquad h = \frac{x_2 - x_1}{n}.$$

We now construct the broken line $p_{n-1}$ with vertices lying above the points
of division. The ordinates of the vertices we denote by

$$b_0,\ b_1,\ b_2,\ \ldots,\ b_{n-1},\ b_n$$

and require that this broken line begin and end at the same points as
the admissible curves, so that $b_0 = y_1$ and $b_n = y_2$. Then the broken
line will be defined by the ordinates

$$b_1,\ b_2,\ \ldots,\ b_{n-1}.$$

The question now is to find out how to choose the broken line
(i.e., the ordinates $b_i$ of its vertices) so as to approximate as closely as
possible the exact solution of the problem.

To achieve this object it is natural to proceed as follows. We compute
the integral I for the broken line. Its value will depend on the $b_i$,

$$I(p_{n-1}) = \Phi(b_1, b_2, \ldots, b_{n-1}),$$

and will therefore be a function of these ordinates. We now choose the
$b_i$ so that they give $I(p_{n-1})$ a minimum value. To define these $b_i$ we will
have the system of equations

$$\frac{\partial \Phi}{\partial b_i} = 0 \qquad (i = 1, 2, \ldots, n - 1).$$

Since any admissible curve, and in particular the exact solution of the
problem, may be approximated by broken lines with any desired accuracy,
both in its position on the plane and in the directions of its tangents,
it is clear that the sequence of broken lines thus constructed will,
in fact, be a minimizing sequence. By taking n sufficiently large, we may
expect to approximate the solution with any desired degree of accuracy
over the whole interval $[x_1, x_2]$. Of course, the fact of convergence must
be investigated in each case.
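The method of broken lines can be sketched in code for one concrete functional. The choice below is our assumption, not an example from the text: $I(y) = \int_0^1 (y'^2 + y^2)\,dx$ with y(0) = 0, y(1) = 1, whose Euler equation y'' = y has the exact solution sinh x / sinh 1. On a broken line the integral can be evaluated exactly, and setting its derivative in each ordinate $b_i$ to zero gives a tridiagonal linear system, solved here by forward elimination and back substitution.

```python
import math

def broken_line_minimizer(n):
    """Ordinates b_0..b_n of the broken line minimizing the integral of y'^2 + y^2
    over [0, 1] with y(0) = 0, y(1) = 1, using n >= 2 equal segments."""
    h = 1.0 / n
    diag = 4.0 / h + 4.0 * h / 3.0     # coefficient of b_i in dI/db_i = 0
    off = -(2.0 / h - h / 3.0)         # coefficient of b_(i-1) and b_(i+1)
    m = n - 1                          # number of unknown interior ordinates
    rhs = [0.0] * m
    rhs[-1] = -off * 1.0               # known value b_n = 1 moved to the right side
    # Forward elimination for the tridiagonal system.
    c = [0.0] * m
    d = [0.0] * m
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, m):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    # Back substitution.
    b = [0.0] * m
    b[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        b[i] = d[i] - c[i] * b[i + 1]
    return [0.0] + b + [1.0]

n = 50
ordinates = broken_line_minimizer(n)
error = max(abs(ordinates[i] - math.sinh(i / n) / math.sinh(1.0))
            for i in range(n + 1))
print(error)
```

For this particular functional the system is linear; for a general F the equations for the $b_i$ are nonlinear and would need an iterative solver.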

The following method, which is very convenient for calculation, is
widely used in physics and technology.

We choose any function $\varphi_0(x)$ satisfying the boundary conditions
$\varphi_0(x_1) = y_1$ and $\varphi_0(x_2) = y_2$, and a sequence of functions $\varphi_1(x), \varphi_2(x), \ldots$
vanishing at the ends of the interval $[x_1, x_2]$.

We then form the linear combination

$$s_n(x) = \varphi_0(x) + a_1\varphi_1(x) + \cdots + a_n\varphi_n(x).$$

For arbitrary values of the numerical coefficients $a_1, a_2, \ldots, a_n$, the
function $s_n(x)$ will be admissible.

Replacing y by $s_n(x)$ in the integral I and making the necessary
computations, we obtain a certain function of the coefficients $a_i$.

We now choose the $a_i$ so that this function has the least possible value.
The coefficients must be found from the system

$$\frac{\partial I(s_n)}{\partial a_i} = 0 \qquad (i = 1, 2, \ldots, n).$$

Solving this system, we obtain, in general, the values of the coefficients
$\bar a_1, \ldots, \bar a_n$ producing a minimum value for $I(s_n)$ and with them we construct
an approximation to the solution

$$\bar s_n(x) = \varphi_0(x) + \bar a_1\varphi_1(x) + \cdots + \bar a_n\varphi_n(x).$$

The sequence of approximations $\bar s_n$ (n = 1, 2, ...) constructed in this
way will not be a minimizing sequence for arbitrary choice of the
functions $\varphi_i$. The necessary condition for it to be so is that the sequence
of functions $\varphi_i$ satisfy a certain condition of "completeness" which we
will not define here.
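A small sketch of this method, under assumptions of our own choosing: the functional is $I(y) = \int_0^1 (y'^2 + y^2)\,dx$ with y(0) = 0, y(1) = 1, the function $\varphi_0(x) = x$, and $\varphi_k(x) = \sin k\pi x$. For this choice the integrals can be done by hand, the equations for the coefficients decouple, and each gives $a_k = 2(-1)^k / \big(k\pi(k^2\pi^2 + 1)\big)$. The exact minimizer sinh x / sinh 1 is used only to measure the error.

```python
import math

def ritz_coefficients(n):
    """Coefficients a_1..a_n minimizing I(s_n) for s_n(x) = x + sum a_k sin(k*pi*x),
    where I(y) is the integral of y'^2 + y^2 over [0, 1] (assumed example)."""
    coeffs = []
    for k in range(1, n + 1):
        kp = k * math.pi
        # From dI/da_k = a_k*(kp^2 + 1) + 2*(-1)^(k+1)/kp = 0.
        coeffs.append(2.0 * (-1) ** k / (kp * (kp * kp + 1.0)))
    return coeffs

def ritz_approximation(x, coeffs):
    """Value of s_n(x) = x + sum a_k sin(k*pi*x)."""
    return x + sum(a * math.sin((k + 1) * math.pi * x)
                   for k, a in enumerate(coeffs))

coeffs = ritz_coefficients(20)
grid = [i / 100 for i in range(101)]
error = max(abs(ritz_approximation(x, coeffs) - math.sinh(x) / math.sinh(1.0))
            for x in grid)
print(error)
```

The sine functions were chosen because they vanish at both ends of the interval and make the system diagonal; with another family of $\varphi_k$ one would generally have to solve a full linear system.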
CHAPTER IX


FUNCTIONS OF A COMPLEX VARIABLE


§1. Complex Numbers and Functions of a Complex Variable 

Complex numbers and their significance in algebra. Complex numbers
were introduced into mathematics in connection with the solution of
algebraic equations. The impossibility of solving the algebraic equation

$$x^2 + 1 = 0 \tag{1}$$

in the domain of real numbers led to the introduction of a conventional
number, the imaginary unit i, defined by the equation

$$i^2 = -1. \tag{2}$$

Numbers of the form a + bi, where a and b are real numbers, were
called complex numbers. These numbers were manipulated like real
numbers, being added and multiplied as binomials. If we also make use
of equation (2), the basic operations of arithmetic when carried out on
complex numbers produce other complex numbers.* The division of
complex numbers being defined as the inverse of multiplication, it turns out
that this operation also is uniquely defined, provided only that the
denominator is not equal to zero. In this manner, the introduction of
complex numbers first brought to light the interesting, though for the
time being purely formal, fact that in addition to the real numbers there
exist other numbers, the complex ones, on which all the arithmetic
operations can be performed.

* Complex numbers are known to the reader from secondary school. See also
Chapter IV, §3.

The next step consists of the geometric representation of complex
numbers. Every complex number a + bi may be represented by a point
in the Oxy plane with coordinates (a, b), or by a vector issuing from the
origin to the point (a, b). This led to a new point of view concerning
complex numbers. Complex numbers are pairs (a, b) of real numbers
for which there are established definitions of the operations of addition
and multiplication, obeying the same laws as for real numbers. Here we
discover a remarkable situation: The sum of two complex numbers

$$(a + bi) + (c + di) = (a + c) + (b + d)i$$

is represented geometrically by the diagonal of the parallelogram
constructed from the vectors representing the summands (figure 1). In this
way, complex numbers are added by the same law as the vector quantities
found in mechanics and physics: forces, velocities, and accelerations.
This was a further reason for considering that complex numbers are not
merely formal generalizations but may be used to represent actual
physical quantities.

We will see later how this point of view is very successful in various
problems of mathematical physics.

However, the introduction of complex numbers had its first successes
in the discovery of the laws of algebra and
analysis. The domain of real numbers, closed with respect to arithmetic
operations, was seen to be not sufficiently extensive for algebra. Even
such a simple equation as (1) does not have a root in the domain of real
numbers, but for complex numbers we have the following remarkable
fact, the so-called fundamental theorem of algebra: Every algebraic
equation

$$z^n + a_1 z^{n-1} + \cdots + a_{n-1}z + a_n = 0$$

with complex coefficients has n complex roots.*

This theorem shows that the complex numbers form a system of
numbers which, in a well-known sense, is complete with respect to the
operations of algebra. It is not at all trivial that adjoining to the domain
of real numbers a root of the single equation (1) leads to the numbers
a + bi in whose domain any algebraic equation is solvable. The
fundamental theorem of algebra showed that the theory of polynomials, even
with real coefficients, may be given a finished form only when we consider
the values of the polynomial in the whole complex plane. The further
development of the theory of algebraic polynomials supported this point
of view more and more. The properties of polynomials are discovered
only by considering them as functions of a complex variable.

* Cf. Chapter IV, §3.

Power series and functions of a complex variable. The development
of analysis brought to light a series of facts showing that the introduction
of complex numbers was significant not only in the theory of polynomials
but also for another very important class of functions, namely those which
are expandable in a power series

$$f(x) = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots. \tag{3}$$

As was already mentioned in Chapter II, the development of the
infinitesimal analysis required the establishment of a more precise point of
view for the concept of a function and for the various possibilities of
defining functions in mathematics. Without pausing here to discuss these
interesting questions, we recall only that at the very beginning of the
development of analysis it turned out that the most frequently encountered
functions could be expanded in a power series in the neighborhood of
every point in their domain of definition. For example, this property
holds for all the so-called elementary functions.

The majority of the concrete problems of analysis led to functions that
are expandable in power series. On the other hand, there was a desire to
connect the definition of a "mathematical" function with a
"mathematical" formula, and the power series represented a very inclusive kind
of "mathematical" formula. This situation even led to serious attempts
to restrict analysis to the study of functions that are expandable in
power series and thus are called analytic functions. The development
of science showed that such a restriction is inexpedient. The problems of
mathematical physics began to extend beyond the class of analytic
functions, which does not even include, for example, functions represented
by curves with a sharp corner. However, the class of analytic functions,
in view of its remarkable properties and numerous applications, proved
to be the most important of all the classes of functions studied by
mathematicians.

Since the computation of each term of a power series requires only
arithmetic operations, the values of a function represented by a power
series may be computed also for complex values of the argument, at
least for those values for which the series is convergent. When we thus
extend the definition of a function of a real variable to complex arguments,
we speak of the "continuation" of the function into the complex domain.
Thus an analytic function, in the same way as a polynomial, may be
considered not only for real values of the argument but also for complex.
Further, we may also consider power series with complex coefficients.
The properties of analytic functions, as also of polynomials, are fully
revealed only when they are considered in the complex domain. To illustrate
we turn now to an example.

Consider the two functions of a real variable

$$e^x \quad \text{and} \quad \frac{1}{1 + x^2}.$$

Both these functions are finite, continuous, and differentiable an arbitrary
number of times on the whole axis Ox. They may be expanded in a
Taylor series, for example, around the origin x = 0:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots, \tag{4}$$

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots. \tag{5}$$

The first of the series so obtained converges for all values of x, while
the second series converges only for -1 < x < +1. Consideration of
the function (5) for real values of the argument does not show why its
Taylor series diverges for $|x| \ge 1$. Passing to the complex domain
allows us to clear up the situation. We consider the series (5) for complex
values of the argument

$$1 - z^2 + z^4 - z^6 + \cdots. \tag{6}$$

The sum of n terms of this series

$$s_n = 1 - z^2 + z^4 - z^6 + \cdots + (-1)^{n-1}z^{2n-2}$$

is computed in the same way as for real values of z:

$$s_n + z^2 s_n = 1 + (-1)^{n-1}z^{2n},$$

hence

$$s_n = \frac{1 + (-1)^{n-1}z^{2n}}{1 + z^2}.$$

This expression shows that for |z| < 1

$$\lim_{n \to \infty} s_n = \frac{1}{1 + z^2},$$

since $|z|^{2n} \to 0$. Thus for complex z satisfying the inequality |z| < 1
the series (6) converges and has the sum $1/(1 + z^2)$. For |z| > 1 the
series (6) diverges, since in this case the difference $s_n - s_{n-1} = (-1)^{n-1}z^{2n-2}$
does not converge to zero.

The inequality |z| < 1 shows that the point z is located at a distance
from the origin which is less than one. Thus the points at which the
series (6) converges form a circle in the complex plane with center at
the origin. On the circumference of this circle there lie two points i and
-i for which the function $1/(1 + z^2)$ becomes infinite; the presence of
these points determines the restrictions on the domain of convergence
of the series (6).
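The behavior of the partial sums of series (6) can be observed directly with complex arithmetic. The sketch below is ours; the two sample points, one inside and one outside the unit circle, are arbitrary choices.

```python
def partial_sum(z, n):
    """s_n = 1 - z^2 + z^4 - ... + (-1)^(n-1) * z^(2n-2)."""
    total, term = 0j, 1 + 0j
    for _ in range(n):
        total += term
        term *= -z * z          # next term of the series
    return total

inside = 0.6 + 0.5j             # |z| < 1
outside = 0.9 + 0.9j            # |z| > 1

# Inside the circle the partial sums approach 1/(1 + z^2).
print(abs(partial_sum(inside, 200) - 1 / (1 + inside * inside)))

# Outside, the n-th term does not tend to zero, so the series diverges.
print([abs(outside) ** (2 * n - 2) for n in (5, 10, 20)])
```

The closed form for $s_n$ holds for every z with $z^2 \ne -1$, inside or outside the circle; only the limit as n grows depends on |z|.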

The domain of convergence of a power series. The domain of
convergence of the power series

$$a_0 + a_1(z - a) + a_2(z - a)^2 + \cdots + a_n(z - a)^n + \cdots \tag{7}$$

in the complex plane is always a circle with center at the point a.

Let us prove this proposition, which is called Abel's theorem.

First of all we note that a series whose terms are the complex numbers $w_n$

$$w_1 + w_2 + \cdots + w_n + \cdots \tag{8}$$

may be considered as two series, consisting of the real parts and the
imaginary parts of the numbers $w_n = u_n + iv_n$:

$$u_1 + u_2 + \cdots + u_n + \cdots, \tag{9}$$

$$v_1 + v_2 + \cdots + v_n + \cdots. \tag{10}$$

A partial sum $s_n$ of the series (8) is expressed by the partial sums $\sigma_n$
and $\tau_n$ of the series (9) and (10)

$$s_n = \sigma_n + i\tau_n,$$

so that convergence of the series (8) is equivalent to convergence of both
the series (9) and (10), and the sum s of the series (8) is expressed by the
sums $\sigma$ and $\tau$ of the series (9) and (10)

$$s = \sigma + i\tau.$$

After these remarks the following lemma is obvious:

If the terms of the series (8) are less in absolute value than the terms
of a convergent geometric progression

$$A + Aq + \cdots + Aq^n + \cdots$$

with positive A and q, where q < 1, then the series (8) converges.

For if $|w_n| < Aq^n$, then

$$|u_n| \le |w_n| < Aq^n,$$

$$|v_n| \le |w_n| < Aq^n,$$

so that (cf. Chapter II, §14) the series (9) and (10) converge and thus the
series (8) also converges.

We now show that if the power series (7) converges at some point $z_0$,
then it converges at all points lying inside the circle with center at a and
having $z_0$ on its boundary (figure 2). From this proposition it follows
readily that the domain of convergence of the series (7)

$$a_0 + a_1(z - a) + \cdots + a_n(z - a)^n + \cdots$$

is either the entire plane, or the single point z = a, or some circle of
finite radius.

Fig. 2.

For let the series (7) converge at the point $z_0$; then the general term
of the series (7) for $z = z_0$ converges to zero for $n \to \infty$, and this means
that all the terms in the series (7) lie inside some circle; let A be the radius
of such a circle, so that for any n

$$|a_n(z_0 - a)^n| < A. \tag{11}$$

We now take any point z closer than $z_0$ to a and show that at the point z
the series converges.

Obviously

$$|z - a| < |z_0 - a|,$$

so that

$$q = \frac{|z - a|}{|z_0 - a|} < 1. \tag{12}$$

Let us estimate the general term of the series (7) at the point z:

$$|a_n(z - a)^n| = \left|a_n(z_0 - a)^n\left(\frac{z - a}{z_0 - a}\right)^n\right| = |a_n(z_0 - a)^n| \cdot \left|\frac{z - a}{z_0 - a}\right|^n;$$

from inequalities (11) and (12) it follows that

$$|a_n(z - a)^n| < Aq^n;$$

i.e., the general term of the series (7) at the point z is less than the general
term of a convergent geometric progression. From the basic lemma
above, the series (7) converges at the point z.

The circle in which a power series converges, and outside of which it
diverges, will be called the circle of convergence; the radius of this circle
is called the radius of convergence of the power series. The boundary of
the circle of convergence, as may be shown, always passes through the
point of the complex plane nearest to a at which the regular behavior
of the function ceases to hold.

The power series (4) converges on the whole complex plane; the power
series (5), as was shown above, has a radius of convergence equal to one.


Exponential and trigonometric functions of a complex variable. A power
series may serve to "continue" a function of a real variable into the
complex domain. For example, for a complex value of z we define the
function $e^z$ by the power series

$$e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots. \tag{13}$$

In like manner the trigonometric functions of a complex variable are
introduced by

$$\sin z = \frac{z}{1!} - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots, \tag{14}$$

$$\cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots. \tag{15}$$

These series converge on the whole plane.

It is interesting to note the connection which occurs between the
exponential and trigonometric functions when we turn to the complex
domain.

If in (13) we replace z by iz, we get

$$e^{iz} = 1 + \frac{iz}{1!} - \frac{z^2}{2!} - \frac{iz^3}{3!} + \frac{z^4}{4!} + \cdots.$$

Grouping everywhere the terms without the multiplier i and the terms
with multiplier i, we have

$$e^{iz} = \cos z + i\sin z. \tag{16}$$

Similarly we can derive

$$e^{-iz} = \cos z - i\sin z. \tag{16'}$$

Formulas (16) and (16') are called Euler's formulas. Solving (16) and
(16') for cos z and sin z, we get

$$\cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin z = \frac{e^{iz} - e^{-iz}}{2i}. \tag{17}$$
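Euler's formulas can be spot-checked numerically by summing the defining series (13) for a complex argument. The sketch below is ours; the sample point z and the number of terms are arbitrary, and the library functions `cmath.cos`, `cmath.sin` serve only as an independent reference.

```python
import cmath

def exp_series(z, terms=60):
    """Partial sum of 1 + z/1! + z^2/2! + ... with the given number of terms."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)     # turns z^n/n! into z^(n+1)/(n+1)!
    return total

z = 0.7 - 1.2j
lhs = exp_series(1j * z)                        # e^{iz} from the series (13)
rhs = cmath.cos(z) + 1j * cmath.sin(z)          # right side of formula (16)
cos_from_17 = (exp_series(1j * z) + exp_series(-1j * z)) / 2

print(abs(lhs - rhs), abs(cos_from_17 - cmath.cos(z)))
```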


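It is very important that for complex values the simple rule of addition
of exponents continues to hold:

$$e^{z_1} \cdot e^{z_2} = e^{z_1 + z_2}. \tag{18}$$

Since for complex values of the argument we define the function $e^z$
by the series (13), formula (18) must be proved on the basis of this
definition. We give the proof:

$$e^{z_1} \cdot e^{z_2} = \left(1 + \frac{z_1}{1!} + \frac{z_1^2}{2!} + \cdots\right) \cdot \left(1 + \frac{z_2}{1!} + \frac{z_2^2}{2!} + \cdots\right).$$

We will carry out the multiplication of series termwise. The terms
obtained in this multiplication of series may be written in the form of a
square table

$$\begin{aligned}
&1 \cdot 1 + 1 \cdot \frac{z_2}{1!} + 1 \cdot \frac{z_2^2}{2!} + 1 \cdot \frac{z_2^3}{3!} + \cdots \\
&+ \frac{z_1}{1!} \cdot 1 + \frac{z_1}{1!} \cdot \frac{z_2}{1!} + \frac{z_1}{1!} \cdot \frac{z_2^2}{2!} + \frac{z_1}{1!} \cdot \frac{z_2^3}{3!} + \cdots \\
&+ \frac{z_1^2}{2!} \cdot 1 + \frac{z_1^2}{2!} \cdot \frac{z_2}{1!} + \frac{z_1^2}{2!} \cdot \frac{z_2^2}{2!} + \frac{z_1^2}{2!} \cdot \frac{z_2^3}{3!} + \cdots \\
&+ \frac{z_1^3}{3!} \cdot 1 + \frac{z_1^3}{3!} \cdot \frac{z_2}{1!} + \frac{z_1^3}{3!} \cdot \frac{z_2^2}{2!} + \frac{z_1^3}{3!} \cdot \frac{z_2^3}{3!} + \cdots \\
&+ \cdots
\end{aligned}$$

We now collect the terms which have the same sum of powers of $z_1$
and $z_2$. It is easy to see that such terms lie on the diagonals of our table.
We get

$$e^{z_1} \cdot e^{z_2} = 1 + \left(\frac{z_1}{1!} + \frac{z_2}{1!}\right) + \left(\frac{z_1^2}{2!} + \frac{z_1}{1!}\frac{z_2}{1!} + \frac{z_2^2}{2!}\right) + \cdots. \tag{19}$$

The general term of this series will be

$$\frac{z_1^n}{n!} + \frac{z_1^{n-1}}{(n-1)!}\frac{z_2}{1!} + \frac{z_1^{n-2}}{(n-2)!}\frac{z_2^2}{2!} + \cdots + \frac{z_2^n}{n!}
= \frac{1}{n!}\left(z_1^n + \frac{n!}{1!\,(n-1)!}\,z_1^{n-1}z_2 + \frac{n!}{2!\,(n-2)!}\,z_1^{n-2}z_2^2 + \cdots + z_2^n\right).$$

Applying the binomial formula of Newton, we get the general term in
the form

$$\frac{(z_1 + z_2)^n}{n!}.$$

So the general term of the series (19) is identical with the general term
of the series for $e^{z_1 + z_2}$, which proves the theorem on the rule for
multiplication (18).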
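The key step of the proof, that the terms on the n-th diagonal of the table add up to $(z_1 + z_2)^n / n!$, can be verified numerically for sample values. The sketch below is ours and the two complex numbers are arbitrary.

```python
import math

def diagonal_sum(z1, z2, n):
    """Sum of the table entries z1^(n-k)/(n-k)! * z2^k/k! for k = 0..n."""
    return sum(z1 ** (n - k) / math.factorial(n - k) * z2 ** k / math.factorial(k)
               for k in range(n + 1))

z1, z2 = 0.3 + 1.1j, -0.8 + 0.4j
for n in range(6):
    # The two printed values should agree up to rounding.
    print(n, diagonal_sum(z1, z2, n), (z1 + z2) ** n / math.factorial(n))
```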

The multiplication theorem and Euler’s formula allow us to dérivé an 
expression for the function e‘ in terms of functions of real variables in 
finite form (without sériés). Thus, putting 


we get 
and since 
we find that 


z = x + iy, 

e , = gx+tv = gx . e i, t 

e ,v = cos y + / sin y, 
e * = e*(cos y + / sin y). 


( 20 ) 


The formula so derived is very convenient for investigating the
properties of the function $e^z$. We note two of its properties: (1) the
function $e^z$ vanishes nowhere; for in fact, $e^x \neq 0$ and the functions
cos y and sin y in formula (20) never vanish simultaneously; (2) the
function $e^z$ has period $2\pi i$, i.e.,

$$e^{z + 2\pi i} = e^z.$$

This last statement follows from the multiplication theorem and the
equality

$$e^{2\pi i} = \cos 2\pi + i \sin 2\pi = 1.$$
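
As a rough numerical illustration of formula (20) and of the two properties just noted, one may compare the finite expression with a library exponential. The function name `exp_via_formula_20` and the sample point are assumptions made for this sketch.

```python
import cmath
import math

def exp_via_formula_20(z):
    """e^z computed as e^x (cos y + i sin y), where z = x + iy."""
    x, y = z.real, z.imag
    return math.exp(x) * complex(math.cos(y), math.sin(y))

z = complex(1.5, -2.0)   # arbitrary sample point
value = exp_via_formula_20(z)

print(abs(value - cmath.exp(z)))                           # agreement with the library value
print(abs(value), math.exp(z.real))                        # |e^z| = e^x, which is never zero
print(abs(exp_via_formula_20(z + 2j * math.pi) - value))   # period 2*pi*i, up to rounding
```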

The formulas (17) allow us to investigate the functions cos z and sin z
in the complex domain. We leave it as an exercise for the reader to prove
that in the complex domain cos z and sin z have period $2\pi$ and that the
theorems about the sine and cosine of a sum continue to hold for them.
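
A numerical spot check does not replace the proof left to the reader, but it shows what is to be proved. In the sketch below the names `cos_z` and `sin_z` and the sample points are assumptions; the two functions are built directly from the exponential expressions for cos z and sin z.

```python
import cmath
import math

def cos_z(z):
    """cos z expressed through the exponential function."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2

def sin_z(z):
    """sin z expressed through the exponential function."""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / (2j)

z1, z2 = complex(0.4, 1.1), complex(-1.3, 0.2)   # arbitrary complex sample points

# Period 2*pi in the complex domain.
print(abs(cos_z(z1 + 2 * math.pi) - cos_z(z1)))
print(abs(sin_z(z1 + 2 * math.pi) - sin_z(z1)))

# Sine and cosine of a sum.
print(abs(sin_z(z1 + z2) - (sin_z(z1) * cos_z(z2) + cos_z(z1) * sin_z(z2))))
print(abs(cos_z(z1 + z2) - (cos_z(z1) * cos_z(z2) - sin_z(z1) * sin_z(z2))))
```

All four printed numbers should be of the order of rounding error.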

The general concept of a function of a complex variable and the
differentiability of functions. Power series allow us to define analytic
functions of a complex variable. However, it is of interest to study the
basic operations of analysis for an arbitrary function of a complex
variable and in particular the operation of differentiation. Here we
uncover very deep-lying facts connected with the differentiation of
functions of a complex variable. As we will see, on the one hand, a
function having a first derivative at all points in a neighborhood of some
point $z_0$ necessarily has derivatives of all orders at $z_0$, and further,
it can be expanded in a power series centered at this point; i.e., it is
analytic. Thus, if we consider differentiable functions of a complex
variable, we return immediately to the class of analytic functions. On the
other hand, a study of the derivative uncovers the geometric behavior of
functions of a complex variable and the connections of the theory of
these functions with problems in mathematical physics.

In view of what has been said, we will, in what follows, call a function
analytic at the point $z_0$ if it has a derivative at all points of some
neighborhood of $z_0$.

We will say, following the general definition of a function, that a
complex variable w is a function of the complex variable z if some law
exists which allows us to find the value of w, given the value of z.

Every complex number z = x + iy is represented by a point (x, y) on
the Oxy plane, and the numbers w = u + iv will also be represented by
points on an Ouv plane, the plane of the function. Then from the geometric
point of view a function of a complex variable w = f(z) defines a law
of correspondence between the points of the Oxy plane of the argument z
and points of the Ouv plane of the value w of the function. In other words,
a function of a complex variable determines a transformation of the
plane of the argument to the plane of the function. To define a function
of a complex variable means to give the correspondence between the pairs
of numbers (x, y) and (u, v); defining a function of a complex variable
is thus equivalent to defining two functions

$$u = \phi(x, y), \qquad v = \psi(x, y),$$

for which, obviously,

$$w = u + iv = \phi(x, y) + i\psi(x, y).$$

For example, if

$$w = z^2 = (x + iy)^2 = x^2 - y^2 + 2ixy,$$

then

$$u = \phi(x, y) = x^2 - y^2, \qquad v = \psi(x, y) = 2xy.$$
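
The passage from one complex function to a pair of real functions is easy to mimic in code. The sketch below is illustrative only; the helper `parts` and the sample point are assumptions made for the example.

```python
def parts(f, x, y):
    """Return the pair (u, v) for which f(x + iy) = u + iv."""
    w = f(complex(x, y))
    return w.real, w.imag

def square(z):
    """The example function w = z^2."""
    return z * z

x, y = 1.5, -0.5
u, v = parts(square, x, y)
print(u, x**2 - y**2)   # prints 2.0 2.0
print(v, 2 * x * y)     # prints -1.5 -1.5
```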

The derivative of a function of a complex variable is defined formally in
the same way as the derivative of a function of a real variable. The
derivative is the limit of the difference quotient of the function

$$f'(z) = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}, \tag{21}$$

if this limit exists.

If we assume that the two real functions u and v, making up w = f(z),
have partial derivatives with respect to x and y, this is still not a
sufficient condition for the derivative of the function f(z) to exist. The
limit of the difference quotient, as a rule, depends on the direction in
which the points $z' = z + \Delta z$ approach the point z (figure 3). For
the existence of the derivative f'(z), it is necessary that the limit does
not depend on the manner of approach of z' to z. Consider, for example,
the case when z' approaches z parallel to the axis Ox or parallel to the
axis Oy.

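In the first case

$$\Delta z = \Delta x,$$

$$f(z + \Delta z) - f(z) = u(x + \Delta x, y) - u(x, y) + i[v(x + \Delta x, y) - v(x, y)],$$

and the difference quotient

$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = \frac{u(x + \Delta x, y) - u(x, y)}{\Delta x} + i\,\frac{v(x + \Delta x, y) - v(x, y)}{\Delta x}$$

for $\Delta x \to 0$ converges to

$$\frac{\partial u}{\partial x} + i\,\frac{\partial v}{\partial x}. \tag{22}$$

Fig. 3.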


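In the second case

$$\Delta z = i\,\Delta y,$$

and the difference quotient

$$\frac{f(z + \Delta z) - f(z)}{\Delta z} = -i\,\frac{u(x, y + \Delta y) - u(x, y)}{\Delta y} + \frac{v(x, y + \Delta y) - v(x, y)}{\Delta y}$$

leads in the limit to

$$\frac{\partial v}{\partial y} - i\,\frac{\partial u}{\partial y}. \tag{23}$$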
If the function w = f(z) has a derivative, these two expressions must
be equal, and thus

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{24}$$

Satisfying these equations is a necessary condition for the existence
of the derivative of the function w = u + iv. It can be shown that
condition (24) is not only necessary but also sufficient (if the functions u
and v have a total differential). We will not give a proof of the sufficiency
of conditions (24), which are called the Cauchy-Riemann equations.
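
The dependence of the difference quotient on the direction of approach can be seen in a small experiment. This is a sketch under stated assumptions: the step h, the sample point, and the second test function (the complex conjugate, which has partial derivatives but violates the Cauchy-Riemann equations) are choices made for the illustration and do not come from the text.

```python
def difference_quotient(f, z, dz):
    """(f(z + dz) - f(z)) / dz for a small complex increment dz."""
    return (f(z + dz) - f(z)) / dz

def square(z):
    return z * z

def conjugate(z):
    return z.conjugate()

z = complex(0.7, -0.4)   # arbitrary sample point
h = 1e-6                 # small real step

for name, f in (("z^2", square), ("conj z", conjugate)):
    along_x = difference_quotient(f, z, h)        # approach parallel to Ox
    along_y = difference_quotient(f, z, 1j * h)   # approach parallel to Oy
    print(name, along_x, along_y)
```

For z^2 both quotients are close to 2z, so the two limits agree. For the conjugate the quotient is close to 1 along Ox and close to -1 along Oy, so no derivative exists.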

It is easy to establish the fact that the usual rules for differentiating
functions of a real variable carry over without alteration to functions of
a complex variable. Certainly this is true for the derivative of the
function $z^n$ and for the derivative of a sum, a product, or a quotient.
The method of proof remains exactly the same as for functions of a real
variable, excepting only that in place of real quantities, complex ones
are to be understood. This shows that every polynomial in z

$$w = a_0 + a_1 z + \cdots + a_n z^n$$

is an everywhere differentiable function. Any rational function, equal to
the quotient of two polynomials

$$w = \frac{a_0 + a_1 z + \cdots + a_n z^n}{b_0 + b_1 z + \cdots + b_n z^n},$$

is differentiable at all points where the denominator is not zero.

In order to establish the differentiability of the function $w = e^z$, we
may use the Cauchy-Riemann conditions. In this case, on the basis of
formula (20),

$$u = e^x \cos y, \qquad v = e^x \sin y;$$

we substitute these functions in (24) and show that the Cauchy-Riemann
equations are satisfied. The derivative may be computed, for example,
by formula (22). This gives

$$\frac{dw}{dz} = e^z.$$

On the basis of formula (17) it is easy to establish the differentiability of
the trigonometric functions and the validity of the formulas known
from analysis for the values of their derivatives.


The function Ln z. We will not give here an investigation of all the
elementary functions of a complex variable. However, it is important
for our purposes to become acquainted with some of the properties of
the function Ln z. As in the case of the real domain, we set

$$w = \operatorname{Ln} z \quad \text{if} \quad z = e^w.$$

In order to analyze the function Ln z, we write the number z in
trigonometric form

$$z = r(\cos\phi + i\sin\phi).$$

Applying the multiplication theorem to $e^w$, we get

$$z = e^w = e^{u + iv} = e^u e^{iv} = e^u(\cos v + i\sin v).$$

Equating the two expressions derived for z, we have

$$e^u = r, \tag{$\alpha$}$$

$$\cos v + i\sin v = \cos\phi + i\sin\phi. \tag{$\beta$}$$

Since u and r are real numbers, from formula ($\alpha$) we derive

$$u = \ln r,$$

where ln r is the usual value of the natural logarithm of a real number.
Equation ($\beta$) can be satisfied only if

$$\cos v = \cos\phi, \qquad \sin v = \sin\phi,$$

and in this case v and $\phi$ must differ by a number which is a multiple
of $2\pi$:

$$v = \phi + 2\pi n,$$

where for any integer n equation ($\beta$) will be satisfied. On the basis
of the expressions derived for u and v,

$$\operatorname{Ln} z = \ln r + i(\phi + 2\pi n). \tag{25}$$

Formula (25) defines the function Ln z for all values of the complex
number z that are different from zero. It gives the definition of the
logarithm not only for positive numbers but also for negative and complex
numbers.

The expression derived for the function Ln z contains an arbitrary
integer n. This means that Ln z is a multiple-valued function: each
fixed value of n gives one of the possible values of the function Ln z.
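
To make the multiple values concrete, one can list a few of them for a negative real number and check that each satisfies the defining relation. The helper name `ln_values` and the choice z = -1 are assumptions made for this sketch; `cmath.polar` supplies r and one admissible angle.

```python
import cmath
import math

def ln_values(z, ns):
    """Values ln r + i(phi + 2*pi*n) of Ln z for the integers n in ns."""
    r, phi = cmath.polar(z)   # z = r (cos phi + i sin phi), with -pi < phi <= pi
    return [complex(math.log(r), phi + 2 * math.pi * n) for n in ns]

z = complex(-1.0, 0.0)        # a negative real number
for w in ln_values(z, range(-1, 2)):
    # every listed value w satisfies e^w = z up to rounding
    print(w, abs(cmath.exp(w) - z))
```

For z = -1 the three values printed are approximately -pi i, pi i and 3 pi i, which differ from one another by multiples of 2 pi i.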

However, the different values of Ln z, as can be shown, are organically
related to one another. In fact, let us fix, for example, the value n = 0 at
the point $z_0$ and then let z move continuously around a closed curve C,
which surrounds the origin and returns to the point $z_0$ (figure 4). During
the motion of z, the angle $\phi$ will increase continuously and when z
moves around the entire closed contour, $\phi$ will increase by $2\pi$. In
this manner, fixing the value of the logarithm at $z_0$

$$(\operatorname{Ln} z)_0 = \ln r_0 + i\phi_0$$

and changing this value continuously while moving z along the closed
curve surrounding the origin, we return to the point $z_0$ with another
value of the function

$$(\operatorname{Ln} z)_0 = \ln r_0 + i(\phi_0 + 2\pi).$$

Fig. 4.

This situation shows us that we may pass continuously from one value
of Ln z to another. For this the point need only travel around the origin
continuously a sufficient number of times. The point z = 0 is called a
branch point of the function Ln z.
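
The continuous change of the logarithm along a closed curve can be imitated step by step. The sketch below assumes a circular path through the starting point, traversed once counterclockwise, and a fixed number of small steps; the function name `continue_log_around_origin` is hypothetical.

```python
import cmath
import math

def continue_log_around_origin(z0, steps=1000):
    """Follow Ln z continuously once counterclockwise around the circle |z| = |z0|."""
    r0, phi = cmath.polar(z0)
    w = complex(math.log(r0), phi)   # starting value, corresponding to n = 0
    z_prev = z0
    for k in range(1, steps + 1):
        z = z0 * cmath.exp(2j * math.pi * k / steps)
        # Small change of the logarithm between neighbouring points; the
        # principal value is adequate here because the ratio is close to 1.
        w += cmath.log(z / z_prev)
        z_prev = z
    return w

z0 = complex(2.0, 1.0)
start = cmath.log(z0)
end = continue_log_around_origin(z0)
print(end - start)   # close to 2*pi*i although z has returned to z0
```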

If we wish to restrict consideration to only one value of the function
Ln z, we must prevent the point z from describing a closed curve
surrounding the point z = 0. This may be done by drawing a continuous
curve from the origin to infinity and preventing the point z from crossing
this curve, which is called a cut. If z varies over the cut plane, then the
logarithm can never pass continuously from one value of Ln z to another
and thus, starting from a specific value of the logarithm at any point
$z_0$, we get at each point only one value of the logarithm. The values of
the function Ln z selected in this way constitute a single-valued branch
of the function Ln z.

For example, if the cut lies along the negative part of the axis Ox,
we get a single-valued branch of Ln z by restricting the argument to the
limits

$$(2k - 1)\pi < \phi < (2k + 1)\pi,$$

where k is an arbitrary integer.

Considering a single-valued branch of the logarithm, we can study its
differentiability. Putting

$$r = \sqrt{x^2 + y^2}, \qquad \phi = \arctan\frac{y}{x},$$

it is easy to show that Ln z satisfies the Cauchy-Riemann conditions and
its derivative, calculated for example by formula (22), will be equal to

$$\frac{d \operatorname{Ln} z}{dz} = \frac{1}{z}.$$

We emphasize that the derivative of Ln z is also a single-valued function.


§2. The Connection Between Functions of a Complex Variable and 
the Problems of Mathematical Physics 

Connection with problems of hydrodynamics. The Cauchy-Riemann
conditions relate the problems of mathematical physics to the theory
of functions of a complex variable. Let us illustrate this from the problems
of hydrodynamics.

Among all possible motions of a fluid an important role is played by
the steady motions. This name is given to motions of the fluid for which 
there is no change with time in the distribution of velocities in space. 
For example, an observer standing on a bridge and watching the flow 
of the river around a supporting pillar sees a steady flow. Sometimes a 
flow is steady for an observer in motion on some conveyance. In the 
case of a steamship travelling through rough water, the flow will appear 
nonsteady to an observer on the shore but steady to one on the ship. 
To a passenger seated in an airplane that is flying with constant velocity, 
the flow of the air as disturbed by the plane will still appear to be a 
steady one. 

For steady motion the velocity vector V of the particle of the fluid 
passing through a given point of space does not change with time. If 
the motion is steady for a moving observer, then the velocity vector 
does not change with time at points having constant coordinates in a 
coordinate system which moves with the observer.

Among the motions of a fluid great importance has been attached to
the class of plane-parallel motions. These are flows for which the velocity
of the particles is everywhere parallel to some plane and the distribution
of the velocities is identical on all planes parallel to the given plane.

If we imagine an infinitely extended mass of fluid, flowing around a
cylindrical body in a direction perpendicular to a generator, the
distribution of velocities will be the same on all planes perpendicular to
the generator, so that the flow will be plane-parallel. In many cases the
motion of a fluid is approximately plane-parallel. For example, if we
consider the flow of air in a plane perpendicular to the wing of an
airplane, the motion of the air may be considered as approximately
plane-parallel, provided the plane in question is not very close either to
the fuselage or to the tip of the wing.

We will show how the theory of functions of a complex variable may 
be applied to the study of steady plane-parallel flow. 

Here we will assume that the liquid is incompressible, i.e., that its 
density does not change with change in pressure. This assumption holds, 
for example, for water, but it can be shown that even air may be considered 
incompressible in the study of its flow, if the velocity of the motion is 
not very large. The hypothesis of incompressibility of air will not produce 
a noticeable distortion if the velocities of motion do not exceed the range 
of 0.6 to 0.8 of the velocity of sound (330 m/sec). 

The flow of a liquid is characterized by the distribution of the velocities 
of its particles. If the flow is plane-parallel, then it is sufficient to determine
the velocities of the particles in one of the planes parallel to which the 
motion occurs. 

We will denote by V(x, y, t) the vector velocity of the particle passing
through the point with coordinates x, y at the instant of time t. In the
case of steady motion, V does not depend on time. The vector V will
be given by its projections u and v on the coordinate axes. We consider
the trajectories of particles of the fluid. In the case of steady motion,
there is no change with time in the velocities of the successive particles
issuing from a given point in space. If we know the field of the velocities,
i.e., if we know the components of the velocity as functions of x, y,
then the trajectories of the particles may be determined by using the
fact that the velocity of a particle is everywhere tangent to the trajectory.
This gives

$$\frac{dy}{dx} = \frac{v(x, y)}{u(x, y)}.$$

The equation so obtained is the differential equation for the trajectories.
The trajectory of a particle in a steady motion is called a streamline.
Through each point of the plane passes exactly one streamline.

An important role is played here by the so-called stream function.
For a fixed streamline $C_0$ let us consider the imaginary channel bounded
by the following four walls: One wall is the cylindrical surface (with
generators perpendicular to the plane of the flow) passing through the
streamline $C_0$; the second wall is the same cylindrical surface for a
neighboring streamline $C_1$; the third is the plane of the flow; and the
fourth is a parallel plane at unit distance (figure 5). If we consider two
arbitrary cross sections of our channel, denoted by $\gamma_1$ and
$\gamma_2$, then the quantity of fluid passing through the sections
$\gamma_1$ and $\gamma_2$ in unit time will be the same. This follows from
the fact that the quantity of fluid inside the part of the channel marked
off by $C_0$, $C_1$ and $\gamma_1$, $\gamma_2$ cannot change, because of
the constant density, while the side walls of the channel $C_0$ and $C_1$
are formed by streamlines, so that no fluid passes through them.
Consequently the same amount of fluid must leave in unit time through
$\gamma_1$ as enters through $\gamma_2$.

Fig. 5.

Now by the stream function we mean the function $\psi(x, y)$ that has a
constant value on the streamline $C_1$ equal to the quantity of liquid
passing in unit time through the cross section of the channel constructed
on the curves $C_0$ and $C_1$.

The stream function is defined only up to an arbitrary constant,
depending on the choice of the initial streamline $C_0$. If we know the
stream function, then the equations for the streamlines are obviously

$$\psi(x, y) = \text{const}.$$

We now wish to express the components of the velocity of the flow at a
given point M(x, y) in terms of the derivatives of the stream function.
To this end we consider the channel formed by the streamline C through
the point M(x, y) and a neighboring streamline C' through a nearby
point $M'(x, y + \Delta y)$, together with the two planes parallel to the
plane of flow and a unit distance apart. Let us compute the quantity of
the liquid q passing through the section MM' of the channel during time dt.

On the one hand, from the definition of the stream function

$$q = (\psi' - \psi)\,dt,$$

where $\psi$ and $\psi'$ are the values of the stream function on C and C',
so that $\psi' - \psi = \Delta\psi$.
On the other hand, q is equal (figure 6) to the volume of the solid
formed by drawing the vector V dt from each point of the section MM'.
If MM' is small, we may assume that V is constant over the whole of
MM' and is equal to the value of V at the point M. The area of the base
of the parallelepiped so constructed is $\Delta y \times 1$ (in figure 6
the unit thickness is not shown), and the altitude is the projection of the
vector V dt on the Ox axis, i.e., u dt, so that

$$q \approx u\,\Delta y\,dt$$

and thus

$$u\,\Delta y \approx \Delta\psi.$$

Dividing this equation by $\Delta y$, and passing to the limit, we get

$$u = \frac{\partial\psi}{\partial y}. \tag{26}$$

A similar argument gives for the second component

$$v = -\frac{\partial\psi}{\partial x}. \tag{26'}$$


To define the field of the velocity vectors, we introduce, in addition to
the stream function, another function, which arises from considering
the rotation of small particles of the liquid. If we imagine that a particular
particle of the fluid were to become solidified, it would in general have
a rotatory motion. However, if the motion of the fluid starts from rest
and if there is no internal friction between particles, then it can be shown
that rotation of the particles of the fluid cannot begin. Motions of a
fluid in which there is no rotation of this sort are called irrotational;
they play a fundamental role in the study of the motion of bodies in a
fluid. In the theory of hydromechanics it is shown that for irrotational
flow there exists a second function $\phi(x, y)$ such that the components
of the velocity are expressed by the formulas

$$u = \frac{\partial\phi}{\partial x}, \qquad v = \frac{\partial\phi}{\partial y}; \tag{27}$$

the function $\phi$ is called the velocity potential of the flow. In what
follows, we will consider motions with velocity potential.

Comparison of the formulas for the components of the velocity from
the stream function and from the velocity potential gives the following
remarkable result.

The velocity potential $\phi(x, y)$ and the stream function $\psi(x, y)$ for
the flow of an incompressible fluid satisfy the Cauchy-Riemann equations

$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y}, \qquad \frac{\partial\phi}{\partial y} = -\frac{\partial\psi}{\partial x}. \tag{28}$$

In other words, the function

$$w = \phi(x, y) + i\psi(x, y)$$

is a differentiable function of a complex variable. Conversely, if we choose
an arbitrary differentiable function of a complex variable, its real and
imaginary parts satisfy the Cauchy-Riemann conditions and may be
considered as the velocity potential and the stream function of the flow
of an incompressible fluid. The function w is called the characteristic
function of the flow.

Let us now consider the significance of the derivative of w. Using,
for example, formula (22), we have

$$\frac{dw}{dz} = \frac{\partial\phi}{\partial x} + i\,\frac{\partial\psi}{\partial x}.$$

From (27) and (26') we find

$$\frac{dw}{dz} = u - iv$$

or, taking complex conjugates,

$$u + iv = \overline{\Bigl(\frac{dw}{dz}\Bigr)}, \tag{29}$$

where the bar over dw/dz denotes the complex conjugate.
Consequently, the velocity vector of the flow is equal to the conjugate
of the value of the derivative of the characteristic function of the flow.
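
The two ways of obtaining the velocity, from the derivative of the characteristic function and from the derivatives of the stream function, can be compared numerically. The sketch below is illustrative: the function names, the step h, and the use of w = z^2 as a test case are assumptions, and the derivatives are replaced by central difference quotients.

```python
def velocity_from_derivative(w, z, h=1e-6):
    """u + iv as the conjugate of dw/dz, with dw/dz approximated numerically."""
    dw_dz = (w(z + h) - w(z - h)) / (2 * h)
    return dw_dz.conjugate()

def velocity_from_stream_function(w, z, h=1e-6):
    """u = d(psi)/dy and v = -d(psi)/dx, where psi is the imaginary part of w."""
    u = (w(z + 1j * h).imag - w(z - 1j * h).imag) / (2 * h)
    v = -(w(z + h).imag - w(z - h).imag) / (2 * h)
    return complex(u, v)

def square(z):
    """Sample characteristic function w = z^2, so that psi = 2xy."""
    return z * z

z = complex(1.0, 2.0)
print(velocity_from_derivative(square, z))        # approximately 2 - 4i
print(velocity_from_stream_function(square, z))   # approximately 2 - 4i
```

Both results agree with the conjugate of 2z at the sample point, as the relation between the velocity and dw/dz requires.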

Examples of plane-parallel flow of a fluid. We consider several examples.
Let

$$w = Az, \tag{30}$$
where A is a complex quantity. From (29) it follows that

$$u + iv = \bar{A}.$$

Thus the linear function (30) defines the flow of a fluid with constant
vector velocity. If we set

$$A = u_0 - iv_0,$$

then, decomposing into the real and imaginary parts of w, we have

$$\phi(x, y) = u_0 x + v_0 y,$$

$$\psi(x, y) = u_0 y - v_0 x,$$

so that the streamlines will be straight lines parallel to the velocity vector
(figure 7).

As a second example we consider the function

$$w = Az^2,$$

where the constant A is real. In order to graph the flow, we first determine
the streamlines. In this case

$$\psi(x, y) = 2Axy,$$

and the equations of the streamlines are

$$xy = \text{const}.$$

These are hyperbolas with the coordinate axes as asymptotes (figure 8).
The arrows show the direction of motion of the particles along the
streamlines for A > 0. The axes Ox and Oy are also streamlines.

If the friction in the liquid is very small, we will not disturb the rest
of the flow if we replace any streamline by a rigid wall, since the fluid
will glide along the wall. Using this principle to construct walls along
the positive coordinate axes (in figure 8 they are represented by heavy
lines), we have a diagram of how the fluid flows irrotationally, in this
case around a corner.

Fig. 7.

Fig. 8.

An important example of a flow is given by the function

$$w = a\Bigl(z + \frac{R^2}{z}\Bigr), \tag{31}$$

where a and R are positive real quantities.
The stream function will be

$$\psi = a\Bigl(y - \frac{R^2 y}{x^2 + y^2}\Bigr),$$

and thus the equation for the streamlines is

$$y - \frac{R^2 y}{x^2 + y^2} = \text{const}.$$

In particular, taking the constant equal to zero, we have either y = 0 or
$x^2 + y^2 = R^2$; thus, a circle of radius R is a streamline. If we replace
the interior of this streamline by a solid body, we obtain the flow around
a circular cylinder. A diagram of the streamlines of this flow is shown in
figure 9. The velocity of the flow may be defined from formula (29) by

$$u + iv = a\Bigl(1 - \frac{R^2}{\bar{z}^2}\Bigr).$$

At a great distance from the cylinder we find

$$\lim_{z \to \infty} (u + iv) = a;$$

i.e., far from the cylinder the velocity tends to a constant value and thus
the flow tends to be uniform. Consequently, formula (31) defines the
flow which arises from the passage around a circular cylinder of a fluid
which is in uniform motion at a distance from the cylinder.
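
The properties of the flow past a cylinder can be checked at a few sample points. In the sketch below the function names and the values of a and R are assumptions made for the illustration; the formulas are those of the characteristic function w = a(z + R^2/z) and of the conjugate of its derivative.

```python
import cmath

def w_cylinder(z, a, R):
    """Characteristic function w = a (z + R^2 / z)."""
    return a * (z + R**2 / z)

def velocity_cylinder(z, a, R):
    """u + iv = conjugate of dw/dz = a (1 - R^2 / conj(z)^2)."""
    return a * (1 - R**2 / z.conjugate()**2)

a, R = 2.0, 1.5   # sample values
for theta in (0.3, 1.0, 2.5):
    z = R * cmath.exp(1j * theta)            # a point on the circle |z| = R
    psi = w_cylinder(z, a, R).imag           # stream function on the circle
    V = velocity_cylinder(z, a, R)
    radial = (V * z.conjugate()).real / R    # velocity component along the radius
    print(psi, radial)                       # both vanish up to rounding

print(velocity_cylinder(complex(1e6, 1e6), a, R))   # close to a far from the cylinder
```

The vanishing radial component expresses the fact that the fluid glides along the circle, which is why the circle may be replaced by a rigid wall.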

Fig. 9.

The basic ideas of the theory of an airplane wing; theorem of Žukovskiĭ.
The application of the theory of functions of a complex variable to the
study of plane-parallel flows of a fluid was the source of several remarkable
discoveries in aerodynamics by Žukovskiĭ and Čaplygin. The study of
streamlines of bodies led them to discover the law for the formation of
lifting force on the wing of an airplane. In order to present the ideas
which led to this discovery, we need to consider one more concrete
example of fluid flow. Let us consider the characteristic function

$$w = \frac{\Gamma}{2\pi i} \operatorname{Ln} z,$$

where $\Gamma$ is a real constant. Although w is a multiple-valued function,
its derivative

$$\frac{dw}{dz} = \frac{\Gamma}{2\pi i}\,\frac{1}{z} \tag{32}$$
is single valued, so that our function uniquely defines the velocity field
of some fluid flow. If we set $z = re^{i\theta}$, the velocity potential and the
stream function may be computed from (25) as

$$\phi = \frac{\Gamma}{2\pi}\,\theta, \qquad \psi = -\frac{\Gamma}{2\pi}\,\ln r.$$

The second of these formulas shows that the streamlines are the circles
r = const (figure 10).

The velocity of the flow is defined by formula (29) as

$$u + iv = -\frac{\Gamma}{2\pi i}\,\frac{1}{\bar{z}}.$$

In particular, it follows that the value of the velocity vector will be

$$V = |u + iv| = \frac{|\Gamma|}{2\pi}\,\frac{1}{r};$$

i.e., the velocity is constant on every streamline. A more detailed
investigation shows that the flow goes counterclockwise for $\Gamma > 0$
and clockwise for $\Gamma < 0$.

Fig. 10.

If we replace one of the streamlines by a rigid boundary, we obtain the
circular motion of a fluid around a cylinder. Such a motion is called
circulatory.

However, the potential of our motion is not a single-valued function. In
one passage over a closed contour around the cylinder the potential is
changed by an amount $\Gamma$. This change in potential is called the
circulation of the flow.
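
The change of the potential over a closed contour can be approximated by summing u dx + v dy along the contour. The following sketch assumes a circular contour around the origin, a midpoint rule with 2000 steps, and the velocity field obtained from the derivative (32); the function names and sample values are hypothetical.

```python
import cmath
import math

def vortex_velocity(z, gamma):
    """u + iv as the conjugate of dw/dz = gamma / (2 pi i z)."""
    dw_dz = gamma / (2j * math.pi * z)
    return dw_dz.conjugate()

def circulation(velocity, radius, steps=2000):
    """Approximate sum of u dx + v dy once counterclockwise around |z| = radius."""
    total = 0.0
    for k in range(steps):
        z_start = radius * cmath.exp(2j * math.pi * k / steps)
        z_end = radius * cmath.exp(2j * math.pi * (k + 1) / steps)
        z_mid = radius * cmath.exp(2j * math.pi * (k + 0.5) / steps)
        V, dz = velocity(z_mid), z_end - z_start
        total += V.real * dz.real + V.imag * dz.imag
    return total

gamma = 3.0
print(circulation(lambda z: vortex_velocity(z, gamma), radius=0.5))   # close to gamma
print(abs(vortex_velocity(complex(0.0, 2.0), gamma)), gamma / (2 * math.pi * 2.0))
```

The first number does not depend on the radius chosen, and the second line shows that the speed at distance r = 2 from the origin equals the absolute value of the constant divided by 2 pi r.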

If to the characteristic function of a flow past a cylinder (31), we add
the characteristic function of a circulatory flow (with clockwise circuit),
we get a new characteristic function

$$w = a\Bigl(z + \frac{R^2}{z}\Bigr) - \frac{\Gamma}{2\pi i} \operatorname{Ln} z. \tag{33}$$

This characteristic function also represents the flow around a cylinder of
radius R. In fact, the stream function will be constant on a circumference
of radius R, since there the coefficients of the imaginary parts of both
terms are constant. The velocity of the flow, defined by the function (33),
will again converge to a as $z \to \infty$. This shows that the characteristic
function (33) defines, for any value of $\Gamma$, the streamlines of a
translational flow past a cylinder. Figure 11 illustrates the character of
the flow for $\Gamma > 0$. This flow will not be symmetric, since the
stagnation points a and b where the streams meet and leave the cylinder
are displaced downward. The potential of the flow under consideration
will be a multiple-valued function. As the result of one circuit around
the cylinder it will change by an amount equal to $\Gamma$.

Fig. 11.
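
The downward displacement of the stagnation points can be made explicit under one assumption about the sign convention: that the characteristic function (33) has the form w = a(z + R^2/z) - (Gamma / 2 pi i) Ln z, i.e., the flow past the cylinder plus a clockwise circulatory term. Setting dw/dz = 0 then gives a quadratic equation in z, solved in the sketch below; the function names and sample values are hypothetical.

```python
import cmath
import math

def stagnation_points(a, R, gamma):
    """Roots of z^2 + (i gamma / (2 pi a)) z - R^2 = 0, where dw/dz vanishes."""
    c = gamma / (4 * math.pi * a)
    root = cmath.sqrt(R**2 - c**2)
    return -1j * c + root, -1j * c - root

def dw_dz(z, a, R, gamma):
    """Derivative of the assumed characteristic function."""
    return a * (1 - R**2 / z**2) - gamma / (2j * math.pi * z)

a, R, gamma = 1.0, 1.0, 4.0   # sample values with gamma < 4 pi a R
for z in stagnation_points(a, R, gamma):
    # each point lies on the circle |z| = R, below the real axis
    print(z, abs(z), abs(dw_dz(z, a, R, gamma)))
```

With these sample values both points have imaginary part -gamma / (4 pi a), so they lie on the lower half of the cylinder; for gamma = 0 they return to the points z = R and z = -R of the symmetric flow.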

Because of symmetry, the flow around a cylinder will usually be of the
form defined by the function (31), but for nonsymmetric bodies the
flow which arises usually has a multiple-valued potential. Later we will
discuss the physical significance of this fact. The methods of the theory
of functions of a complex variable allow us to define the possible flows
around bodies of arbitrary shape. These methods will be discussed in the
following section. With their help we can make use of the flow around
a cylinder to construct the flow, with single-valued or multiple-valued
potential, around any body.

In studying the streamlines of the wing of an airplane, we are dealing
with a body with a sharp edge at the rear. The profile of the wing of an
airplane always narrows toward the rear. If for such a profile we construct
a flow with a single-valued potential, then the stagnation point where the
stream leaves the wing proves not to be at the edge (figure 12a). But it
turns out that such a flow is physically impossible. (Infinite velocity,
with consequent infinite rarefaction of the fluid, would occur at the sharp
edge.) The flow for which the point b falls on the edge of the wing
(figure 12b) is the uniquely possible flow, and this flow, as a rule, will
have a multiple-valued potential, i.e., will be a circulatory flow. The
circulation $\Gamma$ of such a flow again is defined as the change in the
potential for a circuit of a closed contour around the wing.

Fig. 12. (a) (b)

The physical realizability of a flow around the profile of a wing with a
stream leaving the rear edge is called Čaplygin's postulate.

The remarkable discovery of Žukovskiĭ consists of the fact that the
existence of circulation in the flow causes a lifting force on the wing,
in a direction perpendicular to the velocity a of the oncoming flow and
equal in magnitude to the quantity

$$P = \rho a \Gamma,$$

where $\rho$ is the density of the medium and $\Gamma$ is the circulation
(figure 13).

Fig. 13.

This theorem of Žukovskiĭ about the lifting force on a wing is basic
for all contemporary aerodynamics. We will not give the proof here,
merely noting that the usual proofs are based on the theory of integrals
of functions of a complex variable.

The basic results in aerodynamics as established by Žukovskiĭ and
Čaplygin have been extensively developed by the work of Soviet scientists.

Applications to other problems of mathematical physics. The theory
of functions of a complex variable has found wide application not only
in wing theory but in many other problems of hydrodynamics.

However, the domain of application of the theory of functions is not
restricted to hydrodynamics; it is much wider than that, including many
other problems of mathematical physics. To illustrate, we return to the
Cauchy-Riemann conditions

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$

and deduce from them an equation which is satisfied by the real part of
an analytic function of a complex variable. If the first of these equations
is differentiated with respect to x, and the second with respect to y, we
obtain by addition

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$


This equation (which we have already met in Chapter VI) is known as the
Laplace equation. A large number of problems of physics and mechanics
involve the Laplace equation. For example, if the heat in a body is in
equilibrium, the temperature satisfies the Laplace equation. The study
of magnetic or electrostatic fields is connected with this equation. In
the investigation of the filtration of a liquid through a porous medium,
we also arrive at the Laplace equation. In all these problems involving
the solution of the Laplace equation the methods of the theory of functions
have found wide application.
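
That the real part of an analytic function satisfies the Laplace equation can be observed with a finite-difference approximation. The sketch below is only an illustration: the five-point formula, the step h, the sample point, and the two test functions are assumptions made for the example.

```python
import cmath

def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference approximation of g_xx + g_yy."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / h**2

def u(x, y):
    """Real part of the analytic function e^z, i.e. e^x cos y."""
    return cmath.exp(complex(x, y)).real

def not_harmonic(x, y):
    """x^2 + y^2, which is not the real part of any analytic function."""
    return x**2 + y**2

print(laplacian(u, 0.3, -0.8))              # close to 0
print(laplacian(not_harmonic, 0.3, -0.8))   # close to 4
```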

Not only the Laplace equation but also the more general equations of
mathematical physics can be brought into connection with the theory of
functions of a complex variable. One of the most remarkable examples
is provided by planar problems in the theory of elasticity. The foundations
of the application of functions of a complex variable to this domain
were laid by the Soviet scientists G. V. Kolosov and N. I. Mushelišvili.


§3. The Connection of Functions of a Complex Variable with 
Geometry 

Geometric properties of differentiable functions. As in the case of
functions of a real variable, a great role is played in the theory of analytic
functions of a complex variable by the geometric interpretation of these
functions. Broadly speaking, the geometric properties of functions of a
complex variable have not only provided a natural means of visualizing
the analytic properties of the functions but have also given rise to a
special set of problems. The range of problems connected with the
geometric properties of functions has been called the geometric theory of
functions. As we said earlier, from the geometric point of view a function
of a complex variable w = f(z) is a transformation from the z-plane to the
w-plane. This transformation may also be defined by two functions of
two real variables

$$u = u(x, y), \qquad v = v(x, y).$$
If we wish to study the character of the transformation in a very small
neighborhood of some point, we may expand these functions into
Taylor series and restrict ourselves to the leading terms of the expansion

$$u - u_0 = \Bigl(\frac{\partial u}{\partial x}\Bigr)_0 (x - x_0) + \Bigl(\frac{\partial u}{\partial y}\Bigr)_0 (y - y_0) + \cdots,$$

$$v - v_0 = \Bigl(\frac{\partial v}{\partial x}\Bigr)_0 (x - x_0) + \Bigl(\frac{\partial v}{\partial y}\Bigr)_0 (y - y_0) + \cdots,$$

where the derivatives are taken at the point $(x_0, y_0)$. Thus, in the
neighborhood of a point, any transformation may be considered
approximately as an affine transformation*

$$u - u_0 = a(x - x_0) + b(y - y_0),$$

$$v - v_0 = c(x - x_0) + d(y - y_0),$$

where

$$a = \Bigl(\frac{\partial u}{\partial x}\Bigr)_0, \quad b = \Bigl(\frac{\partial u}{\partial y}\Bigr)_0, \quad c = \Bigl(\frac{\partial v}{\partial x}\Bigr)_0, \quad d = \Bigl(\frac{\partial v}{\partial y}\Bigr)_0.$$

Let us consider the properties of the transformation effected by the
analytic function w = f(z) near the point z = x + iy. Let C be a curve
issuing from the point z; on the w-plane the corresponding points trace
out the curve $\Gamma$, issuing from the point w. If z' is a neighboring
point and w' is the point corresponding to it, then for $z' \to z$ we will
have $w' \to w$ and

$$\frac{w' - w}{z' - z} \to f'(z). \tag{34}$$

In particular, it follows that

$$\frac{|w' - w|}{|z' - z|} \to |f'(z)|. \tag{35}$$

This fact may be formulated in the following manner. 

The limit of the ratio of the lengths of corresponding chords in the 
w-plane and in the z-plane at the point z is the same for ail curves issuing 
from the given point z, or as it is also expressed, the ratio oflinear éléments 
on the w-plane and on the z-plane at a given point does not dépend on 
the curve issuing from z. 

The quantity |/'(z)|, which characterizes the magnification of linear 
éléments at the point z, is called the coefficient of dilation at the point z. 


* Cf. Chapter III, §11.

We now suppose that at some point z the derivative f'(z) ≠ 0, so
that f'(z) has a uniquely determined argument.* Let us compute this
argument, using (34):

\[ \arg \frac{w' - w}{z' - z} = \arg(w' - w) - \arg(z' - z), \]

but arg(w' - w) is the angle β' between the chord ww' and the real axis,
and arg(z' - z) is the angle α' between the chord zz' and the real axis.
If we denote by α and β the corresponding angles for the tangents to the
curves C and Γ at the points z and w (figure 14), then for z' → z

\[ \alpha' \to \alpha, \qquad \beta' \to \beta, \]

so that in the limit we get

\[ \arg f'(z) = \beta - \alpha. \tag{36} \]

This equation shows that arg f'(z) is equal to the angle φ through which
the direction of the tangent to the curve C at the point z must be turned
to assume the direction of the tangent to the curve Γ at the point w.
From this property arg f'(z) is called the rotation of the transformation
at the point z.

From equation (36) the reader can easily derive the following propositions.

As we pass from the z-plane to the w-plane, the tangents to all curves
issuing from a given point are rotated through the same angle.

If C_1 and C_2 are two curves issuing from the point z, and Γ_1 and Γ_2
are the corresponding curves from the point w, then the angle between
Γ_1 and Γ_2 at the point w is equal to the angle between C_1 and C_2 at
the point z.


* Cf. Chapter IV, §3.

In this manner, for the transformation effected by an analytic function,
at each point where f'(z) ≠ 0, all linear elements are changed by the
same ratio, and the angles between corresponding directions are not
changed.

Transformations with these properties are called conformal transformations.
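The two properties just stated can be checked numerically. The following is only a sketch: the sample function f(z) = z^2 + e^z, the point z_0, the step h, and the four directions are our own choices for illustration and do not come from the text. For each direction the chord ratio (w' - w)/(z' - z) is computed; its modulus should be close to |f'(z_0)| and its argument close to arg f'(z_0), whatever the direction.

```python
import cmath

def f(z):
    # Sample analytic function, chosen only for illustration.
    return z * z + cmath.exp(z)

def f_prime(z):
    # Derivative of the sample function, computed by hand.
    return 2 * z + cmath.exp(z)

def chord_ratio(z, direction, h=1e-6):
    """Ratio (w' - w)/(z' - z) for the neighboring point z' = z + h*exp(i*direction)."""
    dz = h * cmath.exp(1j * direction)
    return (f(z + dz) - f(z)) / dz

z0 = 0.3 + 0.4j
ratios = [chord_ratio(z0, d) for d in (0.0, 0.7, 2.0, 4.5)]

# Dilation |w' - w| / |z' - z| and rotation arg(w' - w) - arg(z' - z), per direction.
dilations = [abs(q) for q in ratios]
rotations = [cmath.phase(q) for q in ratios]

print("dilations:", dilations, "compare with", abs(f_prime(z0)))
print("rotations:", rotations, "compare with", cmath.phase(f_prime(z0)))
```

Both lists come out nearly constant, which is the numerical counterpart of the statement that the dilation and the rotation do not depend on the curve issuing from z.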

From the geometric properties just proved for transformations near a
point at which f'(z_0) ≠ 0, it is natural to expect that in a small
neighborhood of z_0 the transformation will be one-to-one; i.e., not only will
each point z correspond to only one point w, but also conversely each
point w will correspond to only one point z. This proposition can be
rigorously proved.

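To show more clearly how conformal transformations are distinguished
from various other types of transformations, it is useful to consider an
arbitrary transformation in a small neighborhood of a point. If we
consider the leading terms of the Taylor expansions of the functions u
and v effecting the transformation, we get

\[ u - u_0 = \left(\frac{\partial u}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial u}{\partial y}\right)_0 (y - y_0) + \cdots, \]
\[ v - v_0 = \left(\frac{\partial v}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial v}{\partial y}\right)_0 (y - y_0) + \cdots. \]

If in a small neighborhood of the point (x_0, y_0) we ignore the terms of
higher order, then our transformation will act like an affine transformation.
This transformation has an inverse if its determinant does not vanish:

\[ \Delta = \left(\frac{\partial u}{\partial x}\right)_0 \left(\frac{\partial v}{\partial y}\right)_0 - \left(\frac{\partial u}{\partial y}\right)_0 \left(\frac{\partial v}{\partial x}\right)_0 \ne 0. \]

If Δ = 0, then to describe the behavior of the transformation near the
point (x_0, y_0) we must consider terms of higher order.*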

* In this last case, i.e., for Δ = 0, the transformation is not called affine. For affine
transformations see also Chapter III, §11.

In case u + iv is an analytic function, we can express the derivatives
with respect to y in terms of the derivatives with respect to x by using
the Cauchy-Riemann conditions, from which we get

\[ \Delta = \left(\frac{\partial u}{\partial x}\right)_0^2 + \left(\frac{\partial v}{\partial x}\right)_0^2 = |f'(z_0)|^2, \]

i.e., the transformation has an inverse when f'(z_0) ≠ 0. If we set
f'(z_0) = r(cos φ + i sin φ), then

\[ \left(\frac{\partial u}{\partial x}\right)_0 = \left(\frac{\partial v}{\partial y}\right)_0 = r\cos\varphi, \qquad \left(\frac{\partial v}{\partial x}\right)_0 = -\left(\frac{\partial u}{\partial y}\right)_0 = r\sin\varphi, \]

and the transformation near the point (x_0, y_0) will have the form

\[ u - u_0 = r[(x - x_0)\cos\varphi - (y - y_0)\sin\varphi] + \cdots, \]
\[ v - v_0 = r[(x - x_0)\sin\varphi + (y - y_0)\cos\varphi] + \cdots. \]

These formulas show that in the case of an analytic function w = u + iv,
the transformation near the point (x_0, y_0) consists of rotation through
the angle φ and dilation with coefficient r. In fact, the expressions inside
the brackets are the well-known formulas from analytic geometry for
rotation in the plane through an angle φ, and multiplication by r gives
the dilation.
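As an illustration of these formulas, the following sketch estimates the four partial derivatives of u and v for the sample function w = z^3 by central differences and compares them with r cos φ and r sin φ, where f'(z_0) = r(cos φ + i sin φ). The function z^3, the point (1, 0.5), and the step h are assumptions made only for this example.

```python
import cmath
import math

def u(x, y):
    return x**3 - 3 * x * y**2      # real part of z**3

def v(x, y):
    return 3 * x**2 * y - y**3      # imaginary part of z**3

def partials(x, y, h=1e-5):
    """Central-difference estimates of du/dx, du/dy, dv/dx, dv/dy."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    vx = (v(x + h, y) - v(x - h, y)) / (2 * h)
    vy = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return ux, uy, vx, vy

x0, y0 = 1.0, 0.5
ux, uy, vx, vy = partials(x0, y0)

# f'(z_0) = 3 z_0^2 written in the form r(cos phi + i sin phi).
r, phi = cmath.polar(3 * complex(x0, y0) ** 2)
determinant = ux * vy - uy * vx

print("du/dx, dv/dy:", ux, vy, "compare with r cos(phi) =", r * math.cos(phi))
print("dv/dx, -du/dy:", vx, -uy, "compare with r sin(phi) =", r * math.sin(phi))
print("determinant:", determinant, "compare with r^2 =", r * r)
```

The matrix of partial derivatives is thus r times a rotation matrix, and its determinant agrees with |f'(z_0)|^2.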

To form an idea of the possibilities when f'(z) = 0 it is useful to consider
the function

\[ w = z^n. \tag{37} \]

The derivative of this function w' = nz^(n-1) vanishes for z = 0. The
transformation (37) is most conveniently considered by using polar
coordinates or the trigonometric form of a complex number. Let

\[ z = r(\cos\varphi + i\sin\varphi), \]
\[ w = \rho(\cos\theta + i\sin\theta). \]

Using the fact that in multiplying complex numbers the moduli are
multiplied and the arguments added, we get

\[ z^n = r^n(\cos n\varphi + i\sin n\varphi), \]

and thus

\[ \rho = r^n, \qquad \theta = n\varphi. \]

From the last formula we see that the ray φ = const of the z-plane
transforms into the ray θ = nφ = const in the w-plane. Thus an angle α
between two rays in the z-plane will transform into an angle of magnitude
β = nα. The transformation of the z-plane into the w-plane ceases to
be one-to-one. In fact, a given point w with modulus ρ and argument θ
may be obtained as the image of each of the n points with moduli r = ρ^(1/n)
and arguments

\[ \varphi = \frac{\theta}{n},\ \frac{\theta}{n} + \frac{2\pi}{n},\ \ldots,\ \frac{\theta}{n} + \frac{2\pi}{n}(n - 1). \]

When raised to the power n, the moduli of the corresponding points
will all be equal to ρ and their arguments will be equal to

\[ \theta,\ \theta + 2\pi,\ \ldots,\ \theta + 2\pi(n - 1), \]

and since changing the value of the argument by a multiple of 2π does
not change the geometric position of the point, all the images on the
w-plane are identical.
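The n points with a common image can be listed directly from the relations ρ = r^n and θ = nφ. The sketch below does this for the sample values w = -8 and n = 3, which are chosen only for illustration; the helper name `preimages` is ours.

```python
import cmath
import math

def preimages(w, n):
    """All n points z with z**n == w (w != 0), from rho = r**n and theta = n*phi."""
    rho, theta = cmath.polar(w)
    r = rho ** (1.0 / n)
    return [cmath.rect(r, (theta + 2 * math.pi * k) / n) for k in range(n)]

roots = preimages(-8 + 0j, 3)
for z in roots:
    print(z, "->", z**3)
```

Each of the three points has modulus 2, their arguments differ by 2π/3, and all three are carried into the same point w = -8.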


Conformal transformations. If an analytic function w = f(z) takes a
domain D of the z-plane into a domain Δ of the w-plane in a one-to-one
manner, then we say that it effects a conformal transformation of the
domain D into the domain Δ.

The great role of conformal transformations in the theory of functions
and its applications is due to the following almost trivial theorem.

If ζ = F(w) is an analytic function on the domain Δ, then the composite
function F[f(z)] is an analytic function on the domain D. This theorem
results from the equation

\[ \frac{\Delta\zeta}{\Delta z} = \frac{\Delta\zeta}{\Delta w} \cdot \frac{\Delta w}{\Delta z}. \]

In view of the fact that the functions ζ = F(w) and w = f(z) are
analytic, we conclude that both factors on the right side have a limit,
and thus at each point of the domain D the quotient Δζ/Δz has a unique
limit dζ/dz. This shows that the function ζ = F[f(z)] is analytic.

The theorem just proved shows that the study of analytic functions
on the domain Δ may be reduced to the study of analytic functions on
the domain D. If the geometric structure of the domain D is simpler,
this fact simplifies the study of the functions.

The most important class of domains in which it is necessary to study
analytic functions is the class of simply connected domains. This is the
name given to domains whose boundary consists of one piece (figure 15a)
as opposed to domains whose boundary falls into several pieces (for
example, the domains illustrated in figures 15b and 15c).

We note that sometimes we are interested in investigating functions
on a domain lying outside a curve rather than inside it. If the boundary
of such a domain consists of only one piece, then the domain is also
called simply connected (figure 15d).

At the foundations of the theory of conformal transformations lies the
following remarkable theorem of Riemann.

For an arbitrary simply connected domain Δ, it is possible to construct
an analytic function which effects a conformal transformation of the
circle with unit radius and center at the origin into the given domain in
such a way that the center of the circle is transformed into a given point w_0
of the domain Δ, and a curve in an arbitrary direction at the center of
the circle transforms into a curve with an arbitrary direction at the point
w_0.

This theorem shows that the study of functions of a complex variable
on arbitrary simply connected domains may be reduced to the study of
functions defined, for example, on the unit circle.

We will now explain in general outline how these facts may be applied
to problems in the theory of the wing of an airplane. Let us suppose that
we wish to study the flow around a curved profile of arbitrary shape.

If we can construct a conformal transformation of the domain outside
the profile to the domain outside the unit circle, then we can make use
of the characteristic function for the flow around the circle to construct
the characteristic function for the flow around the profile.

Let ζ be the plane of the circle, z the plane of the profile, and ζ = f(z)
a function effecting the transformation of the domain outside the profile
to the domain outside the circle, where

\[ \lim_{z \to \infty} \zeta = \infty. \]

We denote by a the point of the circle corresponding to the edge of
the profile A and construct the circulatory flow past the circle with one
of the streamlines leaving the circle at a (figure 16). The characteristic
function of this flow will be denoted by W(ζ):

\[ W(\zeta) = \Phi + i\Psi. \]

Fig. 16.

The streamlines of this flow are defined by the equation

\[ \Psi = \text{const}. \]

We now consider the function

\[ w(z) = W[f(z)], \]

and set

\[ w = \varphi + i\psi. \]

We show that w(z) is the characteristic function of the flow past the
profile with a streamline leaving the profile at the point A. First of all
the flow defined by the function w(z) is actually a flow past the profile.
To prove this, we must show that the contour of the profile is a streamline
curve, i.e., that on the contour of the profile

\[ \psi(x, y) = \text{const}. \]

But this follows from the fact that

\[ \psi(x, y) = \Psi(\xi, \eta), \]

and the points (x, y) lying on the profile correspond to the points (ξ, η)
lying on the circle, where Ψ(ξ, η) = const.

It is also simple to show that A is a stagnation point for the flow,
and it may be proved that by suitable choice of velocity for the flow
past the circle, we may obtain a flow past the profile with any desired
velocity.

The important role played by conformal transformations in the theory
of functions and their applications gave rise to many problems of finding
the conformal transformation of one domain into another of a given
geometric form. In a series of simple but useful cases this problem may
be solved by means of elementary functions. But in the general case
the elementary functions are not enough. As we saw earlier, the general
theorem in the theory of conformal transformations was stated by Riemann,
although he did not give a rigorous proof. In fact, a complete proof
required the efforts of many great mathematicians over a period of
several decades.

In close connection with the different approaches to the proof of
Riemann’s theorem came approximation methods for the general construction
of conformal transformations of domains. The actual construction
of the conformal transformation of one domain onto another is sometimes
a very difficult problem. For investigation of many of the general
properties of functions, it is often not necessary to know the actual
transformation of one domain onto another, but it is sufficient to
exploit some of its geometric properties. This fact has led to a wide
study of the geometric properties of conformal transformations. To
illustrate the nature of theorems of this sort we will formulate one of them.

Fig. 17.

Let the circle of unit radius on the z-plane with center at the origin
be transformed into some domain Δ (figure 17), with the origin z = 0
going into the point w_0. If we consider a completely
arbitrary transformation of the circle into the domain Δ, we cannot
make any statements about its behavior at the point z = 0. But
for conformal transformations we have the following remarkable
theorem.

The dilation at the origin does not exceed four times the radius r of the
circle with center at w_0, inscribed in the domain:

\[ |f'(0)| \le 4r. \]
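The constant 4 in this theorem cannot be improved, as the following sketch suggests. It uses the function z/(1 - z)^2, which is not mentioned in the text but is a standard example of a one-to-one analytic function on the unit circle; here w_0 = 0. The radius r is estimated as the smallest distance from w_0 to sampled images of points of the circumference |z| = 1, which is an assumption adequate for this particular function.

```python
import cmath
import math

def koebe(z):
    # The function z/(1 - z)^2, analytic and one-to-one in the unit circle.
    return z / (1 - z) ** 2

# Dilation at the origin, estimated by a central difference.
h = 1e-6
dilation = abs((koebe(h) - koebe(-h)) / (2 * h))

# Images of points of the circumference |z| = 1 (z = 1 is a pole and is skipped).
N = 2000
boundary = [koebe(cmath.exp(2j * math.pi * k / N)) for k in range(1, N)]

# w_0 = 0, so the inscribed radius is the distance from 0 to the boundary of the image.
r = min(abs(w) for w in boundary)

print("dilation |f'(0)| =", dilation)
print("inscribed radius r =", r, "  4r =", 4 * r)
```

For this function the dilation at the origin is 1 and the inscribed radius is 1/4, so that the inequality |f'(0)| ≤ 4r holds with equality.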

Various questions in the theory of conformal transformations were
considered in a large number of studies by Soviet mathematicians. In
these works exact formulas were derived for many interesting classes of
conformal transformations, methods for approximate calculation of
conformal transformations were developed, and many general geometric
theorems on conformal transformations were established.



Quasi-conformal transformations. Conformal transformations are closely
connected with the investigation of analytic functions, i.e., with the study
of a pair of functions satisfying the Cauchy-Riemann conditions

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \]

But many problems in mathematical physics involve more general systems
of differential equations, which may also be connected with transformations
from one plane to another, and these transformations will have
specific geometric properties in the neighborhood of points in the Oxy
plane. To illustrate, we consider the following example of differential
equations

\[ \frac{\partial u}{\partial x} = p(x, y)\frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -p(x, y)\frac{\partial u}{\partial y}. \tag{38} \]

If p(x, y) = 1, this is the system of Cauchy-Riemann equations. In
the general case of an arbitrary function p(x, y), we can also consider
every solution of the system (38) as a transformation of the Oxy plane
to the Ouv plane. Let us examine the geometric properties of this
transformation in the neighborhood of a point (x_0, y_0). Taking a small
neighborhood of (x_0, y_0), we retain only the first terms in the expansion of
the functions u and v in powers of x - x_0 and y - y_0, and thereby
consider the following affine transformation

\[ u - u_0 = \left(\frac{\partial u}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial u}{\partial y}\right)_0 (y - y_0), \tag{39} \]
\[ v - v_0 = \left(\frac{\partial v}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial v}{\partial y}\right)_0 (y - y_0). \]

If the functions u and v satisfy the system of equations (38), then for
this affine transformation we have the following property.

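Ellipses with center at the point (x_0, y_0) with principal axes parallel to
the coordinate axes, and with ratio of semiaxes

\[ \frac{b}{a} = p(x_0, y_0), \]

are transformed in the Ouv plane to circles with center at the point
(u_0, v_0).

Let us prove this proposition. The equation of the circle with center
(u_0, v_0) in the Ouv plane will be

\[ (u - u_0)^2 + (v - v_0)^2 = \rho^2. \]

Replacing u - u_0 and v - v_0 by their expressions in terms of x and y,
we get the equation for the corresponding curve in the Oxy plane:

\[ \left[\left(\frac{\partial u}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial u}{\partial y}\right)_0 (y - y_0)\right]^2 + \left[\left(\frac{\partial v}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial v}{\partial y}\right)_0 (y - y_0)\right]^2 = \rho^2. \]

Using the equations in (38) to express the derivatives of v in terms
of the derivatives of u, and writing p for p(x_0, y_0), we find that the
terms containing the product (x - x_0)(y - y_0) cancel, and we get

\[ \left[\left(\frac{\partial u}{\partial x}\right)_0^2 + p^2\left(\frac{\partial u}{\partial y}\right)_0^2\right](x - x_0)^2 + \frac{1}{p^2}\left[\left(\frac{\partial u}{\partial x}\right)_0^2 + p^2\left(\frac{\partial u}{\partial y}\right)_0^2\right](y - y_0)^2 = \rho^2. \]

If we set

\[ a = \frac{\rho}{\sqrt{\left(\frac{\partial u}{\partial x}\right)_0^2 + p^2\left(\frac{\partial u}{\partial y}\right)_0^2}}, \qquad b = pa, \]

this equation takes the form

\[ \frac{(x - x_0)^2}{a^2} + \frac{(y - y_0)^2}{b^2} = 1. \]

Thus the curve that is transformed into a circle is in fact an ellipse with
the indicated properties.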
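The proposition can be tried out on a concrete solution of the system (38). In the sketch below the coefficient p is taken to be the constant 2, and the pair u = x^2 - y^2/p^2, v = 2xy/p is used as a solution; both are our own assumptions for illustration (that this pair satisfies (38) is checked in the code). Points of an ellipse with b/a = p are carried by the affine transformation (39) into points equidistant from (u_0, v_0).

```python
import math

p = 2.0                      # constant coefficient p(x, y); an assumption of this sketch
x0, y0 = 1.0, 1.0

# Partial derivatives at (x0, y0) of u = x^2 - y^2/p^2 and v = 2xy/p.
ux, uy = 2 * x0, -2 * y0 / p**2
vx, vy = 2 * y0 / p, 2 * x0 / p

# Residuals of the two equations of system (38) at the point; both should vanish.
residual_38 = (abs(ux - p * vy), abs(vx + p * uy))

a = 0.01                     # semiaxis parallel to the x-axis
b = p * a                    # semiaxis parallel to the y-axis, so that b/a = p

radii = []
for k in range(12):
    t = 2 * math.pi * k / 12
    dx, dy = a * math.cos(t), b * math.sin(t)    # a point of the ellipse
    du = ux * dx + uy * dy                       # affine transformation (39)
    dv = vx * dx + vy * dy
    radii.append(math.hypot(du, dv))

expected_radius = a * math.sqrt(ux**2 + p**2 * uy**2)
print("radii:", radii)
print("expected radius:", expected_radius)
```

All twelve image points lie at the same distance from the center, and this distance agrees with the relation between a and ρ obtained in the proof.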

If we do not consider the affine transformation given by the first terms
of the expansion but rather the exact transformation itself, then the above
property of the transformation will hold more and more exactly for
smaller and smaller ellipses, so that we may say that the property holds
for infinitely small ellipses.

In this manner, from equations (38) it follows that at every point the
infinitesimal ellipse that is transformed into a circle has its semiaxes
completely determined by the transformation, both with respect to their
direction and to the ratio of their lengths. It can be shown that this
geometric property completely characterizes the system of differential
equations (38); i.e., if the functions u and v effect a transformation with
the given geometric property, then they satisfy this system of equations.
In this way, the problem of investigating the solutions of equations (38)
is equivalent to investigating transformations with the given properties.

We note, in particular, that for the Cauchy-Riemann equations this
property is formulated in the following manner.

An infinitesimal circle with center at the point (x_0, y_0) is transformed
into an infinitesimal circle with center at the point (u_0, v_0).

A very wide class of equations of mathematical physics may be reduced
to the study of transformations with the following geometric properties.

For each point (x, y) of the argument plane, we are given the direction
of the semiaxes of two ellipses and also the ratio of the lengths of these
semiaxes. We wish to construct a transformation of the Oxy plane to
the Ouv plane such that an infinitesimal ellipse of the first family transforms
into an infinitesimal ellipse of the second with center at the point (u, v).

Transformations connected with such general systems of equations
were introduced by the Soviet mathematician M. A. Lavrent’ev and have
received the name quasi-conformal. The idea of studying transformations
defined by systems of differential equations made it possible to extend
the methods of the theory of analytic functions to a very wide class of
problems. Lavrent’ev and his students developed the study of quasi-conformal
transformations and found a large number of applications
to various problems of mathematical physics, mechanics, and geometry.
It is interesting to note that the study of quasi-conformal transformations
has proved very fruitful in the theory of analytic functions itself. Of
course, we cannot dwell here on all the various applications of the geometric
method in the theory of functions of a complex variable.


§4. The Line Integral; Cauchy’s Formula and Its Corollaries


Integrals of functions of a complex variable. In the study of the
properties of analytic functions the concept of the integral of a function
of a complex variable plays a very important role. Corresponding to the
definite integral of a function of a real variable, we here deal with the
integral of a function of a complex variable along a curve. We consider
in the plane a curve C beginning at the point z_0 and ending at the
point z, and a function f(z) defined on a domain containing the curve C.
We divide the curve C into small segments (figure 18) at the points

\[ z_0,\ z_1,\ \ldots,\ z_n = z \]

Fig. 18.

and consider the sum

\[ S = \sum_{k=1}^{n} f(z_k)(z_k - z_{k-1}). \]

If the function f(z) is continuous and the curve C has finite length,
we can prove, just as for real functions, that as the number of points
of division is increased and the distance between neighboring points
decreases to zero, the sum S approaches a completely determined limit.
This limit is called the integral along the curve C and is denoted by

\[ \int_C f(z)\,dz. \]

We note that in this definition of the integral we have distinguished
between the beginning and the end of the curve C; in other words, we
have chosen a specific direction of motion on the curve C.
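The definition lends itself to direct computation. The sketch below forms the sum S for the sample function f(z) = z^2 along a quarter of the unit circumference from 1 to i; the function, the curve, and the numbers of division points are chosen only for illustration. For this curve the parametrization z = e^(it) gives the value (e^(3πi/2) - 1)/3 = (-1 - i)/3, with which the sums are compared.

```python
import cmath
import math

def integral_sum(f, points):
    """S = sum over k of f(z_k) * (z_k - z_{k-1}) for division points z_0, ..., z_n."""
    return sum(f(zk) * (zk - zprev) for zprev, zk in zip(points, points[1:]))

def quarter_circle(n):
    """Division points on the arc z = exp(i t), 0 <= t <= pi/2, running from 1 to i."""
    return [cmath.exp(1j * math.pi * k / (2 * n)) for k in range(n + 1)]

f = lambda z: z * z
exact = (-1 - 1j) / 3

for n in (10, 100, 1000, 10000):
    s = integral_sum(f, quarter_circle(n))
    print(n, s, abs(s - exact))

approx = integral_sum(f, quarter_circle(10000))
```

As the division is refined, the printed differences decrease roughly in proportion to the distance between neighboring points.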

It is easy to prove a number of simple properties of the integral.

1. The integral of the sum of two functions is equal to the sum of the
integrals of the individual functions:

\[ \int_C [f(z) + g(z)]\,dz = \int_C f(z)\,dz + \int_C g(z)\,dz. \]

2. A constant multiple may be taken outside the integral sign:

\[ \int_C A f(z)\,dz = A \int_C f(z)\,dz. \]

3. If the curve C is the sum of the curves C_1 and C_2, then

\[ \int_C f(z)\,dz = \int_{C_1} f(z)\,dz + \int_{C_2} f(z)\,dz. \]

4. If C̄ is the curve C with opposite orientation, then

\[ \int_{\bar{C}} f(z)\,dz = -\int_C f(z)\,dz. \]

All these properties are obvious for the approximating sums and carry
over to the integral in passing to the limit.

5. If the length of the curve C is equal to L and if everywhere on C
the inequality

\[ |f(z)| \le M \]

is satisfied, then

\[ \left|\int_C f(z)\,dz\right| \le ML. \]

Let us prove this property. It is sufficient to prove the inequality for
the sum S, since then it will carry over in the limit for the integral also.
For the sum

\[ |S| = \left|\sum f(z_k)(z_k - z_{k-1})\right| \le \sum |f(z_k)|\,|z_k - z_{k-1}| \le M \sum |z_k - z_{k-1}|. \]

But the sum in the second factor is equal to the sum of the lengths of
the segments of the broken line inscribed in the curve C with vertices
at the points z_k. The length of the broken line, as is well known, is not
greater than the length of the curve, so that

\[ |S| \le ML. \]

We consider the integral of the simplest function f(z) = 1. Obviously
in this case

\[ S = (z_1 - z_0) + (z_2 - z_1) + \cdots + (z_n - z_{n-1}) = z_n - z_0 = z - z_0. \]

This proves that

\[ \int_C 1 \cdot dz = z - z_0. \]

This result shows that for the function f(z) = 1 the value of the integral
for all curves joining the points z_0 and z is the same. In other words,
the value of the integral depends only on the beginning and end points
of the path of integration. But it is easy to show that this property does
not hold for arbitrary functions of a complex variable. For example,
if f(z) = x, then a simple computation shows that

\[ \int_{C_1} x\,dz = \frac{x^2}{2} + iyx, \qquad \int_{C_2} x\,dz = \frac{x^2}{2}, \qquad z = x + iy, \]

where C_1 and C_2 are the paths of integration shown in figure 19.
We leave it to the reader to verify these equations.

A remarkable fact in the theory of analytic functions is the following
theorem of Cauchy.

If f(z) is differentiable at every point of a simply connected domain D,
then the integrals over all paths joining two arbitrary points of the
domain z_0 and z are the same.

We will not give a proof of Cauchy’s theorem here, but refer the
interested reader to any course in the theory of functions of a complex
variable. Let us mention here some important consequences of this theorem.

First of all, Cauchy’s theorem allows us to introduce the indefinite
integral of an analytic function. For let us fix the point z_0 and consider
the integral along curves connecting z_0 and z:

\[ F(z) = \int_{z_0}^{z} f(\zeta)\,d\zeta. \]

Here we may take the integral over any curve joining z_0 and z, since
changing the curve does not change the value of the integral, which thus
depends only on z. The function F(z) is called an indefinite integral of f(z).

An indefinite integral of f(z) has a derivative equal to f(z).
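The contrast between a function such as f(z) = x and a differentiable function can be seen in a small computation. The sketch below integrates along two broken lines from 0 to z = x + iy; we assume, without access to figure 19, that the first path runs along the real axis and then vertically, and the second vertically and then horizontally. The end point 1 + 2i, the comparison function z^2, and the number of division points are likewise our own choices.

```python
def integral_sum(f, points):
    """S = sum over k of f(z_k) * (z_k - z_{k-1})."""
    return sum(f(zk) * (zk - zprev) for zprev, zk in zip(points, points[1:]))

def polyline(vertices, n=4000):
    """Division points along the straight segments joining the given vertices."""
    points = [vertices[0]]
    for start, end in zip(vertices, vertices[1:]):
        points += [start + (end - start) * k / n for k in range(1, n + 1)]
    return points

x, y = 1.0, 2.0
z = complex(x, y)
path_1 = polyline([0j, complex(x, 0), z])   # along the real axis, then vertically
path_2 = polyline([0j, complex(0, y), z])   # vertically, then horizontally

real_part = lambda w: w.real                # f(z) = x, which is not differentiable
square = lambda w: w * w                    # f(z) = z^2, differentiable everywhere

i1 = integral_sum(real_part, path_1)
i2 = integral_sum(real_part, path_2)
j1 = integral_sum(square, path_1)
j2 = integral_sum(square, path_2)

print("f = x  :", i1, i2)
print("f = z^2:", j1, j2, "compare with z^3/3 =", z**3 / 3)
```

For f(z) = x the two sums approach x^2/2 + iyx and x^2/2 respectively, while for z^2 both approach the same value z^3/3, the value at z of the indefinite integral with z_0 = 0.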

In many applications it is convenient to have a slightly different
formulation of Cauchy’s theorem, as follows.

If f(z) is everywhere differentiable in a simply connected domain, then
the integral over any closed contour lying in this domain is equal to zero:

\[ \int_\Gamma f(z)\,dz = 0. \]

This is obvious since a closed contour has the same beginning and end,
so that z_0 and z may be joined by a null path.

By a closed contour we will understand a contour traversed in the
counterclockwise direction. If the contour is traversed in the clockwise
direction we will denote it by Γ̄.

The Cauchy integral. On the basis of the last theorem we can prove
the following fundamental formula of Cauchy that expresses the value
of a differentiable function at interior points of a closed contour in
terms of the values of the function on the contour itself:

\[ f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - z}. \]

We give a proof of this formula. Let z be fixed and ζ be an independent
variable. The function

\[ \varphi(\zeta) = \frac{f(\zeta)}{\zeta - z} \]

will be continuous and differentiable at every point ζ inside the domain D,
with the exception of the point ζ = z, where the denominator vanishes,
a circumstance that prevents the application of Cauchy’s theorem to the
function φ(ζ) on the contour C.

We consider a circle K_ρ with center at the point z and radius ρ and
show that

\[ \int_C \varphi(\zeta)\,d\zeta = \int_{K_\rho} \varphi(\zeta)\,d\zeta. \tag{40} \]

To this end we construct the auxiliary closed contour Γ_ρ, consisting of
the contour C, the path γ_ρ connecting C with the circle, and the circle K_ρ,
taken with the opposite orientation (figure 20). The contour Γ_ρ is
indicated by arrows. Since the point ζ = z is excluded, the function φ(ζ)
is differentiable everywhere inside Γ_ρ and thus

\[ \int_{\Gamma_\rho} \varphi(\zeta)\,d\zeta = 0. \tag{41} \]

But the contour Γ_ρ is divided into four parts: C, γ_ρ, K̄_ρ and γ̄_ρ, so that
from property 3 in the last subsection, we have

\[ \int_{\Gamma_\rho} \varphi(\zeta)\,d\zeta = \int_C \varphi(\zeta)\,d\zeta + \int_{\gamma_\rho} \varphi(\zeta)\,d\zeta + \int_{\bar{K}_\rho} \varphi(\zeta)\,d\zeta + \int_{\bar{\gamma}_\rho} \varphi(\zeta)\,d\zeta = 0. \]

Replacing the integrals along K̄_ρ and γ̄_ρ by integrals along K_ρ and γ_ρ,
and using property 4, we get

\[ \int_{\Gamma_\rho} \varphi(\zeta)\,d\zeta = \int_C \varphi(\zeta)\,d\zeta - \int_{K_\rho} \varphi(\zeta)\,d\zeta = 0, \]

which proves formula (40).

To compute the right side of (40), we set

\[ \int_{K_\rho} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \int_{K_\rho} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta + \int_{K_\rho} \frac{f(z)\,d\zeta}{\zeta - z} = \int_{K_\rho} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta + f(z)\int_{K_\rho} \frac{d\zeta}{\zeta - z}. \tag{42} \]

We compute the second term first. On the circle K_ρ,

\[ \zeta = z + \rho(\cos\theta + i\sin\theta). \]

Using the fact that z and ρ are constant, we get

\[ d\zeta = \rho(-\sin\theta + i\cos\theta)\,d\theta = i\rho(\cos\theta + i\sin\theta)\,d\theta, \]

and thus, since

\[ \zeta - z = \rho(\cos\theta + i\sin\theta), \]

we have dζ/(ζ - z) = i dθ, so that

\[ \int_{K_\rho} \frac{d\zeta}{\zeta - z} = i\int_0^{2\pi} d\theta = 2\pi i, \]

since for a circuit of the circumference the total change in θ is equal to
2π. From (40) and (42) we have

\[ \int_C \frac{f(\zeta)\,d\zeta}{\zeta - z} = 2\pi i f(z) + \int_{K_\rho} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta. \]

In this equation let us take limits as ρ → 0. The left side and the first
term of the right side will remain unchanged. We will show that the limit
of the second term is equal to zero. Then for ρ → 0 our equation gives
us Cauchy’s formula. In order to prove that the second term tends to
zero as ρ → 0 we note that

\[ \lim_{\zeta \to z} \frac{f(\zeta) - f(z)}{\zeta - z} = f'(z), \]

i.e., the expression under the integral sign has a finite limit, and thus is
bounded:

\[ \left|\frac{f(\zeta) - f(z)}{\zeta - z}\right| < M. \]

Applying property 5 of the integral, we have

\[ \left|\int_{K_\rho} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta\right| \le M \cdot 2\pi\rho \to 0. \]

This completes the proof of Cauchy’s formula. Cauchy’s formula is one
of the basic tools of investigation in the theory of functions of a complex
variable.

Expansion of differentiable functions in a power series. We apply
Cauchy’s theorem to establish two basic properties of differentiable
functions of a complex variable.

Every function of a complex variable that has a first derivative in a
domain D has derivatives of all orders.

In fact, inside a closed contour our function may be expressed by the
Cauchy integral formula

\[ f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - z}. \]

The function of z under the sign of integration is a differentiable function;
thus, differentiating under the integral sign, we get

\[ f'(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - z)^2}. \]

Under the integral sign there is again a differentiable function; thus we
can again differentiate, obtaining

\[ f''(z) = \frac{1 \cdot 2}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - z)^3}. \]

Continuing the differentiation, we get the general formula

\[ f^{(n)}(z) = \frac{n!}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - z)^{n+1}}. \]

In this manner we may compute the derivative of any order. To make
this proof completely rigorous, we need also to show that the differentiation
under the integral sign is valid. We will not give this part of the proof.
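These formulas can be tested numerically; for n = 0 the test is a check of Cauchy’s formula itself. In the sketch below the contour is a circle of radius 2 about the origin, the function is e^z (all of whose derivatives equal e^z), and the integral is replaced by a sum over equally spaced points of the circumference. These choices, and the helper name `cauchy_derivative`, are assumptions made only for the example.

```python
import cmath
import math

def cauchy_derivative(f, z, n, center=0j, radius=2.0, m=2000):
    """n!/(2 pi i) times the integral of f(zeta)/(zeta - z)^(n+1) over a circle,
    approximated by a sum over m equally spaced points of the circumference."""
    total = 0j
    for k in range(m):
        e = cmath.exp(2j * math.pi * k / m)
        zeta = center + radius * e
        dzeta = 1j * radius * e * (2 * math.pi / m)
        total += f(zeta) / (zeta - z) ** (n + 1) * dzeta
    return math.factorial(n) * total / (2j * math.pi)

z = 0.3 + 0.2j
values = [cauchy_derivative(cmath.exp, z, n) for n in range(4)]
for n, value in enumerate(values):
    print(n, value, "compare with", cmath.exp(z))
```

The values on the contour alone thus determine the function and each of its derivatives at the interior point z.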

The second property is the following:

If f(z) is everywhere differentiable on a circle K with center at the point a,
then f(z) can be expanded in a Taylor series

\[ f(z) = f(a) + \frac{f'(a)}{1!}(z - a) + \cdots + \frac{f^{(n)}(a)}{n!}(z - a)^n + \cdots, \]

which converges inside K.

In §1 we defined analytic functions of a complex variable as functions
that can be expanded in power series. This last theorem says that every
differentiable function of a complex variable is an analytic function.
This is a special property of functions of a complex variable that has
no analogue in the real domain. A function of a real variable that has
a first derivative may fail to have a second derivative at every point.

We prove the theorem formulated in the previous paragraphs.

Let f(z) have a derivative inside and on the boundary of the circle K
with center at the point a. Then inside K the function f(z) can be expressed
by the Cauchy integral

\[ f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - z}. \tag{43} \]

We write

\[ \zeta - z = (\zeta - a) - (z - a); \]

then

\[ \frac{1}{\zeta - z} = \frac{1}{(\zeta - a) - (z - a)} = \frac{1}{\zeta - a} \cdot \frac{1}{1 - \dfrac{z - a}{\zeta - a}}. \tag{44} \]

Using the fact that the point z lies inside the circle, and ζ is on the
circumference, we get

\[ \left|\frac{z - a}{\zeta - a}\right| < 1, \]

so that from the basic formula for a geometric progression

\[ \frac{1}{1 - \dfrac{z - a}{\zeta - a}} = 1 + \frac{z - a}{\zeta - a} + \cdots + \left(\frac{z - a}{\zeta - a}\right)^n + \cdots, \tag{45} \]

and the series on the right converges. Using (44) and (45), we can represent
formula (43) in the form

\[ f(z) = \frac{1}{2\pi i}\int_C \left[\frac{f(\zeta)}{\zeta - a} + (z - a)\frac{f(\zeta)}{(\zeta - a)^2} + \cdots + (z - a)^n \frac{f(\zeta)}{(\zeta - a)^{n+1}} + \cdots\right] d\zeta. \]

We now apply term-by-term integration to the series inside the brackets.
(The validity of this operation can be established rigorously.) Removing
the factor (z - a)^n, which does not depend on ζ, from the integral sign
in each term, we get

\[ f(z) = \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{\zeta - a} + \frac{z - a}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - a)^2} + \cdots + \frac{(z - a)^n}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} + \cdots. \]

Now using the integral formulas for the sequence of derivatives, we
may write

\[ \frac{1}{2\pi i}\int_C \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} = \frac{f^{(n)}(a)}{n!}, \]

so that we get

\[ f(z) = f(a) + \frac{f'(a)}{1!}(z - a) + \cdots + \frac{f^{(n)}(a)}{n!}(z - a)^n + \cdots. \]

We have shown that differentiable functions of a complex variable
can be expanded in power series. Conversely, functions represented by
power series are differentiable. Their derivatives may be found by
term-by-term differentiation of the series. (The validity of this operation can
be established rigorously.)
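The coefficients of the expansion can be computed directly from the contour integrals that appear in the proof. The following sketch does this for the sample function 1/(1 - z) about the point a = 0, integrating over a circle of radius 0.9, and then sums the first twenty terms of the series at a point inside the circle. The function, the radius, the number of terms, and the helper name `taylor_coefficient` are all assumptions of the example.

```python
import cmath
import math

def taylor_coefficient(f, a, n, radius, m=512):
    """1/(2 pi i) times the integral of f(zeta)/(zeta - a)^(n+1) over |zeta - a| = radius,
    approximated by a sum over m equally spaced points of the circumference."""
    total = 0j
    for k in range(m):
        e = cmath.exp(2j * math.pi * k / m)
        zeta = a + radius * e
        dzeta = 1j * radius * e * (2 * math.pi / m)
        total += f(zeta) / (zeta - a) ** (n + 1) * dzeta
    return total / (2j * math.pi)

f = lambda z: 1 / (1 - z)       # differentiable for |z| < 1
a = 0j
coeffs = [taylor_coefficient(f, a, n, radius=0.9) for n in range(20)]

z = 0.3 + 0.1j
partial_sum = sum(c * (z - a) ** n for n, c in enumerate(coeffs))
print("first coefficients:", coeffs[:4])
print("partial sum:", partial_sum, "compare with f(z) =", f(z))
```

Every coefficient comes out equal to 1 to within rounding, in agreement with the geometric progression 1 + z + z^2 + ···, and the partial sum reproduces f(z).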

Entire functions. A power series gives an analytic representation of
a function only in some circle. This circle has a radius equal to the distance
to the nearest point at which the function ceases to be analytic, i.e., to
the nearest singular point of the function.

Among analytic functions it is natural to single out the class of functions
that are analytic for all finite values of their argument. Such functions
are represented by power series, converging for all values of the argument z,
and are called entire functions of z. If we consider expansions about the
origin, then an entire function will be expressed by a series of the form

\[ G(z) = c_0 + c_1 z + c_2 z^2 + \cdots + c_n z^n + \cdots. \]

If in this series all the coefficients, from a certain one on, are equal to
zero, the function is simply a polynomial, or an entire rational function

\[ P(z) = c_0 + c_1 z + \cdots + c_n z^n. \]

If in the expansion there are infinitely many terms that are different from
zero, then the entire function is called transcendental.

Examples of such functions are: 

e * = 1 + JT + -Ir + * 

z z 3 z' 3 

sin 2 ~ 1! 3! + 5! ’ 

1 Z* , z* 

COSZ= 1 
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In the study of properties of polynomials, an important rôle is played 
by the distribution of the roots of the équation 

P(z) = 0, 

or, more generally speaking, we may raise the question of the distribution 
of the points for which the polynomial has a given value A 

P(z) = A. 

The fundamental theorem of algebra says that every polynomial takes 
a given value A in at least one point. This property cannot be extended 
to an arbitrary entire function. For example, the function w = e z does 
not take the value zéro at any point of the z-plane. However, we do hâve 
the following theorem of Picard: Every entire function assumes every 
arbitrarily preassigned value an infinité number of times, with the possible 
exception of one value. 

The distribution of the points of the plane at which an entire function 
takes on a given value A is one of the central questions in the theory 
of entire functions. 

The number of roots of a polynomial, counted with multiplicity, is equal
to its degree. The degree of a polynomial is closely related to the
rapidity of growth of $|P(z)|$ as $|z| \to \infty$. In fact, we can write

$$|P(z)| = |z|^n \cdot \left| a_n + \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n} \right|,$$

and since for $|z| \to \infty$ the second factor tends to $|a_n|$, a polynomial
of degree n, for large values of $|z|$, grows like $|a_n| \cdot |z|^n$. So it is clear
that for larger values of n, the growth of $|P_n(z)|$ for $|z| \to \infty$ will be
faster and also the polynomial will have more roots. It turns out that
this principle is also valid for entire functions. However, for an entire
function f(z), generally speaking, there are infinitely many roots, and
thus the question of the number of roots has no meaning. Nevertheless,
we can consider the number of roots n(r, a) of the equation

$$f(z) = a$$

in a circle of radius r, and investigate how this number changes with
increasing r. The rate of growth of n(r, a) proves to be connected with
the rate of growth of the maximum M(r) of the modulus of the entire
function on the circle of radius r. As stated earlier, for an entire function
there may exist one exceptional value of a for which the equation may
not have even one root. For all other values of a, the rate of growth
of the number n(r, a) is comparable to the rate of growth of the quantity
ln M(r). We cannot give more exact formulations here for these laws.
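
The statement that a polynomial of degree n grows like $|a_n| \cdot |z|^n$ can be observed numerically. The sketch below is illustrative only; the helper names and the sample polynomial $3z^3 - 2z + 7$ are assumptions chosen for the example, not taken from the text.

```python
def poly_eval(coeffs, z):
    """Evaluate a_n z^n + ... + a_0 by Horner's rule; coeffs = [a_n, ..., a_0]."""
    result = 0
    for c in coeffs:
        result = result * z + c
    return result


def growth_ratio(coeffs, z):
    """Return |P(z)| / |z|^n, which should approach |a_n| as |z| grows."""
    n = len(coeffs) - 1
    return abs(poly_eval(coeffs, z)) / abs(z) ** n


if __name__ == "__main__":
    coeffs = [3, 0, -2, 7]  # hypothetical example: P(z) = 3 z^3 - 2 z + 7
    for radius in (1, 10, 100, 1000):
        # Evaluate on the imaginary axis; any direction gives the same limit.
        print(radius, growth_ratio(coeffs, radius * 1j))
```

The printed ratios settle near 3, the modulus of the leading coefficient.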

The properties of the distribution of the roots of entire functions are
connected with problems in the theory of numbers and have enabled
mathematicians to establish many important properties of the Riemann
zeta function,* on the basis of which it is possible to prove many theorems
about prime numbers.


Fractional or meromorphic functions. The class of entire functions
may be considered as an extension of the class of algebraic polynomials.
From the polynomials we may derive the wider class of rational functions

$$R(z) = \frac{P(z)}{Q(z)},$$

which are the quotients of two polynomials.

Similarly it is natural to form a new class of functions by means of
entire functions. A function f(z) which is the quotient of two entire
functions $G_1(z)$ and $G_2(z)$,

$$f(z) = \frac{G_1(z)}{G_2(z)},$$

is called a fractional or meromorphic function. The class of functions
arising in this way plays a large role in mathematical analysis. Among the
elementary functions contained in the class of meromorphic functions
are, for example:

$$\tan z = \frac{\sin z}{\cos z}, \qquad \cot z = \frac{\cos z}{\sin z}.$$

A meromorphic function will not be analytic on the whole complex
plane. At those points where the denominator $G_2(z)$ vanishes, the function
f(z) becomes infinite. The roots of $G_2(z)$ form a set of isolated points
in the plane. In neighborhoods of these points, the function f(z) naturally
cannot be expanded in a Taylor series; in a neighborhood of such a
point a, however, a meromorphic function may be represented by a
power series that also contains a certain number of negative powers
of $(z - a)$:

$$f(z) = \frac{C_{-m}}{(z-a)^m} + \cdots + \frac{C_{-1}}{z-a} + C_0 + C_1(z-a) + \cdots + C_n(z-a)^n + \cdots. \qquad (46)$$


* Cf. Chapter X on the theory of numbers.

As z approaches the point a, the value of f(z) tends to infinity. An
isolated singular point at which an analytic function goes to infinity is
called a pole. The loss of analyticity of the function at the point a comes
from the terms with negative powers of $z - a$ in the expansion (46).
The expression

$$\frac{C_{-m}}{(z-a)^m} + \cdots + \frac{C_{-1}}{z-a}$$

characterizes the behavior of a meromorphic function near a singular
point and is called the principal part of the expansion (46). The behavior
of a meromorphic function is determined by its principal part in a
neighborhood of a pole. In many cases, if we know the principal part of the
expansion of a meromorphic function in the neighborhood of all its
poles, we may construct the function. Thus, for example, if f(z) is rational
and vanishes at infinity, then it is equal to the sum of the principal parts
of its expansions about all of its poles $a_k$, the number of which, for a
rational function, is finite:

$$f(z) = \sum_k \left[ \frac{C^{(k)}_{-m_k}}{(z-a_k)^{m_k}} + \cdots + \frac{C^{(k)}_{-1}}{z-a_k} \right].$$

In the general case a rational function may be represented as the sum
of all of its principal parts and a polynomial

$$f(z) = \sum_k \left[ \frac{C^{(k)}_{-m_k}}{(z-a_k)^{m_k}} + \cdots + \frac{C^{(k)}_{-1}}{z-a_k} \right] + C_0 + C_1 z + \cdots + C_m z^m. \qquad (47)$$
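
As a small worked instance of a decomposition of the form (47), consider the rational function $z^3/(z^2 - 1)$, chosen here purely for illustration. It has simple poles at $z = 1$ and $z = -1$, each with principal part coefficient 1/2, and its polynomial part is z. The sketch below checks the identity in exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def rational(z):
    """The example function z^3 / (z^2 - 1)."""
    return z**3 / (z**2 - 1)


def from_principal_parts(z):
    """Sum of the principal parts at z = 1 and z = -1 plus the polynomial part z."""
    return HALF / (z - 1) + HALF / (z + 1) + z


if __name__ == "__main__":
    for z in (Fraction(3), Fraction(-5, 7), Fraction(1, 3)):
        print(z, rational(z), from_principal_parts(z))
```

Both columns agree at every sample point, since $\frac{1/2}{z-1} + \frac{1/2}{z+1} = \frac{z}{z^2-1}$ and $z + \frac{z}{z^2-1} = \frac{z^3}{z^2-1}$.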

Formula (47) gives an expression for a rational function in which the
role played by its singular points is clear. Expression (47) for a rational
function is very convenient for various applications of rational functions
and also has great theoretical interest as showing how the singular points
of the function define its structure everywhere. It turns out that, just as
in the case of a rational function, every meromorphic function may be
constructed from the principal parts of its poles. We introduce without
proof the appropriate expression, for example, for the function cot z.
The poles of the function cot z are obtained as the roots of the equation

$$\sin z = 0$$

and are situated at the points $\ldots, -k\pi, \ldots, -\pi, 0, \pi, \ldots, k\pi, \ldots$. It may
be shown that the principal part of the expansion of the function cot z
in a power series at the pole $z = k\pi$ will be

$$\frac{1}{z - k\pi},$$

and the function cot z is equal to the sum of the principal parts with
respect to all poles

$$\cot z = \frac{1}{z} + \sum_{k=1}^{\infty} \left( \frac{1}{z - k\pi} + \frac{1}{z + k\pi} \right). \qquad (48)$$

The expansion of a meromorphic function in a series of the principal
parts is noteworthy in that it clearly shows the position of all the singular
points and also allows us to compute the function on the whole of its
domain of definition.
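
The expansion (48) for cot z can be tested numerically by truncating the sum. This is a rough sketch under the assumption that the principal parts at $k\pi$ and $-k\pi$ are paired as in (48); the function names are illustrative, and the truncated sum converges slowly (the error after N pairs is of order $|z|/N$).

```python
import cmath
import math


def cot_partial_fractions(z, pairs):
    """Approximate cot z by 1/z plus the first `pairs` paired principal parts."""
    total = 1 / z
    for k in range(1, pairs + 1):
        k_pi = k * math.pi
        total += 1 / (z - k_pi) + 1 / (z + k_pi)
    return total


def cot(z):
    """Reference value cos z / sin z."""
    return cmath.cos(z) / cmath.sin(z)


if __name__ == "__main__":
    z = 1 + 0.5j
    for pairs in (10, 100, 1000, 10000):
        print(pairs, abs(cot_partial_fractions(z, pairs) - cot(z)))
```

The printed differences shrink roughly in proportion to 1/N, in line with the convergence of the paired series.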

The theory of meromorphic functions has become fundamental for
the study of many classes of functions that are of great importance in
analysis. In particular, we must emphasize its significance for the equations
of mathematical physics. The creation of the theory of integral equations,
providing answers to many important questions in the theory of the
equations of mathematical physics, was based to a great extent on the
fundamental theorems for meromorphic functions.

Since that time the development of that part of functional analysis
which is most closely connected with mathematical physics, namely the
theory of operators, has very often depended on facts from the theory
of analytic functions.

On analytic representation of functions. We saw previously that in a
neighborhood of every point where a function is differentiable it may be
defined by a power series. For an entire function the power series converges
on the whole plane and gives an analytic expression for the function
wherever it is defined. In case the function is not entire, the Taylor series,
as we know, converges only in a circle whose circumference passes through
the nearest singular point of the function. Consequently the power
series does not allow us to compute the function everywhere, and so it
may happen that an analytic function cannot be given by a power series
on its whole domain of definition. For a meromorphic function an analytic
expression giving the function on its whole domain of definition is the
expansion in principal parts.

If a function is not entire but is defined in some circle or if we have
a function defined in some domain but we want to study it only in a
circle, then the Taylor series may serve to represent it. But when we
study the function in domains that are different from circles, there arises
the question of finding an analytic expression for the function suitable
for representing it on the whole domain. A power series giving an
expression for an analytic function in a circle has as its terms the simplest
polynomials $a_n z^n$. It is natural to ask whether we can expand an analytic
function in an arbitrary domain in a more general series of polynomials.
Then every term of the series can again be computed by arithmetic
operations, and we obtain a method for representing functions that is
once more based on the simplest operations of arithmetic. The general
answer to this question is given by the following theorem.

An analytic function, given on an arbitrary domain, the boundary of
which consists of one curve, may be expanded in a series of polynomials

$$f(z) = P_0(z) + P_1(z) + \cdots + P_n(z) + \cdots.$$

The theorem formulated gives only a general answer to the question
of expanding a function in a series of polynomials in an arbitrary domain
but does not yet allow us to construct the series for a given function,
as was done earlier in the case of the Taylor series. This theorem raises
rather than solves the question of expanding functions in a series of
polynomials. Questions of the construction of the series of polynomials,
given the function or some of its properties, questions of the construction
of more rapidly converging series or of series closely related to the behavior
of the function itself, questions of the structure of a function defined
by a given series of polynomials, all these questions represent an extensive
development of the theory of approximation of functions by series of
polynomials. In the creation of this theory a large role has been played
by Soviet mathematicians, who have derived a series of fundamental
results.


§5. Uniqueness Properties and Analytic Continuation 

Uniqueness properties of analytic functions. One of the most remarkable 
properties of analytic functions is their uniqueness, as expressed in the 
following theorem. 

If in the domain D two analytic functions are given that agree on some
curve C lying inside the domain, then they agree on the entire domain.

The proof of this theorem is very simple. Let $f_1(z)$ and $f_2(z)$ be the two
functions analytic in the domain D and agreeing on the curve C. The
difference

$$\varphi(z) = f_1(z) - f_2(z)$$

will be an analytic function on the domain D and will vanish on the curve C.
We now show that $\varphi(z) = 0$ at every point of the domain D. In fact,
if in the domain D there exists a point $z_0$ (figure 21) at which $\varphi(z_0) \ne 0$,
we extend the curve C to the point $z_0$ and proceed along the extended
curve $\Gamma$ toward $z_0$ as long as the function remains equal to zero on $\Gamma$.
Let $\zeta$ be the last point of $\Gamma$ that is accessible in this way. If $\varphi(z_0) \ne 0$,
then $\zeta \ne z_0$, and on a segment of the curve $\Gamma$ beyond $\zeta$ the function $\varphi(z)$,
by the definition of the point $\zeta$, will not be equal to zero. We show that
this is impossible. In fact, on the part $\Gamma_\zeta$ of the curve $\Gamma$ up to the point $\zeta$,
we have $\varphi(z) = 0$. We may compute all derivatives of the function $\varphi(z)$
on $\Gamma_\zeta$ using only the values of $\varphi(z)$ on $\Gamma_\zeta$, so that on $\Gamma_\zeta$ all derivatives
of $\varphi(z)$ are equal to zero. In particular, at the point $\zeta$

$$\varphi(\zeta) = \varphi'(\zeta) = \cdots = \varphi^{(n)}(\zeta) = \cdots = 0.$$

Let us expand the function $\varphi(z)$ in a Taylor series at the point $\zeta$. All
the coefficients of the expansion vanish, so that we get

$$\varphi(z) = 0$$

in some circle with center at the point $\zeta$, lying in the domain D. In
particular, it follows that the equation $\varphi(z) = 0$ must be satisfied on some
segment of the curve $\Gamma$ lying beyond $\zeta$. The assumption $\varphi(z_0) \ne 0$ gives
us a contradiction.

Fig. 21.

This theorem shows that if we know the values of an analytic function
on some segment of a curve or on some part of a domain, then the values
of the function are uniquely determined everywhere in the given domain.
Consequently, the values of an analytic function in various parts of the
argument plane are closely connected with one another.

To realize the significance of this uniqueness property of an analytic
function, it is only necessary to recall that the general definition of a
function of a complex variable allows any law of correspondence between
values of the argument and values of the function. With such a definition
there can, of course, be no question of determining the values of a function
at any point by its values in another part of the plane. We see that the
single requirement of differentiability of a function of a complex variable
is so strong that it determines the connection between values of the
function at different places.

We also emphasize that in the theory of functions of a real variable
the differentiability of a function does not in itself lead to any similar
consequences. In fact, we may construct examples of functions that are
infinitely often differentiable and agree on some part of the Ox axis
but differ elsewhere. For example, a function equal to zero for all negative
values of x may be defined in such a manner that for positive x it differs
from zero and has continuous derivatives of every order. For this it is
sufficient, for example, to set, for $x > 0$,

$$f(x) = e^{-1/x}.$$
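
A numerical illustration of such a function, assuming the standard choice $f(x) = e^{-1/x}$ for $x > 0$ and $f(x) = 0$ for $x \le 0$ (the name `flat_at_zero` is hypothetical). Every derivative at the origin vanishes because $e^{-1/x}$ tends to zero faster than any power of x; the sketch only samples that decay and does not prove it.

```python
import math


def flat_at_zero(x):
    """0 for x <= 0 and exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


if __name__ == "__main__":
    # f(x) / x^k stays tiny near 0 for every fixed power k tried here.
    for k in (1, 5, 10):
        print(k, [flat_at_zero(x) / x**k for x in (0.05, 0.02, 0.01)])
```

The function agrees with the zero function on the whole negative axis and yet is positive for every positive x, which is impossible for an analytic function of a complex variable.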





Analytic continuation and complete analytic functions. The domain of
definition of a given function of a complex variable is often restricted
by the very manner of defining the function. Consider a very elementary
example. Let the function be given by the series

$$f(z) = 1 + z + z^2 + \cdots + z^n + \cdots. \qquad (49)$$

This series, as is well known, converges in the unit circle and diverges
outside this circle. Thus the analytic function given by formula (49) is
defined only in this circle. On the other hand, we know that the sum of
the series (49) in the circle $|z| < 1$ is expressed by the formula

$$f(z) = \frac{1}{1 - z}. \qquad (50)$$

Formula (50) has meaning for all values of $z \ne 1$. From the uniqueness
theorem it follows that expression (50) represents the unique analytic
function agreeing with the sum of the series (49) in the circle $|z| < 1$.
So this function, given at first only in the unit circle, has been extended
to the whole plane with the exception of the point $z = 1$.
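
The relation between the series (49) and the closed form (50) is easy to observe numerically. In this sketch (helper names are illustrative) the partial sums approach $1/(1 - z)$ inside the unit circle, while outside it they grow without bound even though (50) still gives a finite value.

```python
def geometric_partial_sum(z, n_terms):
    """Return 1 + z + z^2 + ... + z^(n_terms - 1)."""
    total, power = 0, 1
    for _ in range(n_terms):
        total += power
        power *= z
    return total


def continued(z):
    """The closed form 1 / (1 - z), meaningful for every z != 1."""
    return 1 / (1 - z)


if __name__ == "__main__":
    inside, outside = 0.5 + 0.3j, 2
    print(abs(geometric_partial_sum(inside, 200) - continued(inside)))
    print(geometric_partial_sum(outside, 20), continued(outside))
```

At $z = 2$ the partial sums are $2^N - 1$, yet the analytic continuation takes the value $-1$ there.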

If we have a function f(z) defined inside some domain D, and there
exists another function F(z) defined in a domain A, containing D, and
agreeing with f(z) in D, then from the uniqueness theorem the value of
F(z) in A is defined in a unique manner.

The function F(z) is called the analytic continuation of f(z). An analytic
function is called complete if it cannot be continued analytically beyond
the domain on which it is already defined. For example, an entire function,
defined for the whole plane, is a complete function. A meromorphic
function is also a complete function; it is defined everywhere except at
its poles. However, there exist analytic functions whose entire domain
of definition is a bounded domain. We will not give these more complicated
examples.

The concept of a complete analytic function leads to the necessity of
considering multiple-valued functions of a complex variable. We show
this by the example of the function

$$\operatorname{Ln} z = \ln r + i\varphi,$$

where $r = |z|$ and $\varphi = \arg z$. If at some point $z_0 = r_0(\cos\varphi_0 + i\sin\varphi_0)$
of the z-plane we consider some initial value of the function

$$(\operatorname{Ln} z)_0 = \ln r_0 + i\varphi_0,$$

then our analytic function may be extended continuously along a curve C.
As was mentioned earlier, it is easy to see that if the point z describes
a closed path $C_0$, issuing from the point $z_0$ and circling around the origin
(figure 22), and then returning to the point $z_0$, we find at the point $z_0$
the original value of $\ln r_0$ but the angle $\varphi$ is increased by $2\pi$. This shows
that if we extend the function Ln z in a continuous manner along the
path $C_0$, we increase its value by $2\pi i$ in one circuit of the contour $C_0$.
If the point z moves along this closed contour n times, then in place of
the original value

$$(\operatorname{Ln} z)_0 = \ln r_0 + i\varphi_0$$

we obtain the new value

$$(\operatorname{Ln} z)_n = \ln r_0 + (2\pi n + \varphi_0)i.$$

If the point z describes the contour m times in the opposite direction,
we get

$$(\operatorname{Ln} z)_{-m} = \ln r_0 + (-2\pi m + \varphi_0)i.$$

These remarks show that on the complex plane we are unavoidably
compelled to consider the connection between the various values of Ln z.
The function Ln z has infinitely many values. With respect to its
multiple-valued character, a special role is played by the point $z = 0$, around
which we pass from one value of the function to another. It is easy to
establish that if z describes a closed contour not surrounding the origin,
the value of Ln z is not changed. The point $z = 0$ is called a branch
point of the function Ln z.
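
The continuation of Ln z along a closed path can be imitated on a computer by stepping along the path and, at each step, choosing the value of the logarithm nearest to the previous one. The following is a sketch under that assumption (it is a discrete stand-in for continuous extension, valid only when the steps are small); the function names and the circular paths are illustrative.

```python
import cmath
import math


def continue_log(path):
    """Continue Ln z along a list of nonzero points, starting from the principal value.

    Each new value is the principal logarithm shifted by the multiple of
    2*pi*i that brings it closest to the previous value.
    """
    value = cmath.log(path[0])
    for z in path[1:]:
        candidate = cmath.log(z)
        k = round((value.imag - candidate.imag) / (2 * math.pi))
        value = candidate + 2j * math.pi * k
    return value


def circular_path(center, start, turns, steps_per_turn=360):
    """Points on the circle about `center` through `start`, traversed `turns` times.

    Negative `turns` means the opposite (clockwise) direction.
    """
    steps = max(1, steps_per_turn * abs(turns))
    offset = start - center
    return [center + offset * cmath.exp(2j * math.pi * turns * t / steps)
            for t in range(steps + 1)]


if __name__ == "__main__":
    z0 = cmath.rect(2.0, 0.3)
    for turns in (1, 3, -2):
        change = continue_log(circular_path(0, z0, turns)) - cmath.log(z0)
        print(turns, change / (2j * math.pi))   # close to the number of circuits
    # A contour that does not surround the origin leaves the value unchanged.
    print(continue_log(circular_path(3, 4, 1)) - cmath.log(4))
```

Circuits around the origin change the value by the corresponding multiple of $2\pi i$, while the circle about the point 3 of radius 1 produces no change.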

In general, if for a function f(z), in a circuit around the point a, we
pass from one of its values to another, then the point a is called a branch
point of the function f(z).

Let us consider a second example. Let

$$w = \sqrt[n]{z}.$$

As noted previously, this function is also multiple-valued and takes on
n values

$$\sqrt[n]{r}\left(\cos\frac{\varphi}{n} + i\sin\frac{\varphi}{n}\right), \quad \sqrt[n]{r}\left(\cos\frac{\varphi + 2\pi}{n} + i\sin\frac{\varphi + 2\pi}{n}\right), \quad \ldots,$$

$$\sqrt[n]{r}\left(\cos\frac{\varphi + 2\pi(n-1)}{n} + i\sin\frac{\varphi + 2\pi(n-1)}{n}\right).$$

All the various values of our function may be derived from the single
one

$$w_0 = \sqrt[n]{r}\left(\cos\frac{\varphi}{n} + i\sin\frac{\varphi}{n}\right)$$

by describing a closed curve around the origin, since for each circuit
around the origin the angle $\varphi$ will be increased by $2\pi$.

In describing the closed curve $(n - 1)$ times, we obtain from the first
value of $\sqrt[n]{z}$ all the remaining $(n - 1)$ values. Going around the contour
the nth time leads back to the value

$$\sqrt[n]{r}\left(\cos\frac{\varphi + 2\pi n}{n} + i\sin\frac{\varphi + 2\pi n}{n}\right) = \sqrt[n]{r}\left(\cos\frac{\varphi}{n} + i\sin\frac{\varphi}{n}\right),$$

i.e., we return to the original value of the root.
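
The n values of the root and the return to the original value after n circuits can be reproduced with a few lines of code. This is an illustrative sketch; the function names are not from the text, and the angle is simply increased by $2\pi$ per circuit as described above.

```python
import cmath
import math


def nth_roots(z, n):
    """Return the n values of the nth root of a nonzero complex number z."""
    r, phi = cmath.polar(z)
    modulus = r ** (1.0 / n)
    return [cmath.rect(modulus, (phi + 2 * math.pi * k) / n) for k in range(n)]


def root_after_circuits(z, n, circuits):
    """Value reached from the first root after `circuits` circuits of the origin."""
    r, phi = cmath.polar(z)
    return cmath.rect(r ** (1.0 / n), (phi + 2 * math.pi * circuits) / n)


if __name__ == "__main__":
    for w in nth_roots(-8, 3):
        print(w, w**3)                      # each cube is close to -8
    print(root_after_circuits(-8, 3, 3))    # same as the value after 0 circuits
```

After one or two circuits the other two cube roots appear, and the third circuit restores the starting value.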


Riemann surfaces for multiple-valued functions. There exists an easily
visualized geometric manner of representing the character of a
multiple-valued function.

We consider again the function Ln z, and on the z-plane we make a
cut along the positive part of the axis Ox. If the point z is prevented
from crossing the cut, then we cannot pass continuously from one value
of Ln z to another. If we continue Ln z from the point $z_0$, we can arrive
only at the same value of Ln z.

The single-valued function found in this manner in the cut z-plane
is called a single-valued branch of the function Ln z. All the values of
Ln z are distributed on an infinite set of single-valued branches

$$\ln r + i\varphi, \qquad 2\pi n < \varphi \le 2\pi(n + 1).$$

It is easy to show that the nth branch takes on the same value on the
lower side of the cut as the (n + 1)th branch has on the upper side.

To distinguish the different branches of Ln z, we imagine infinitely
many copies of the z-plane, each of them cut along the positive part
of the axis Ox, and map onto the nth sheet the values of the argument z
corresponding to the nth branch. The points lying on different copies
of the plane but having the same coordinates will here correspond to one
and the same number $x + iy$; but the fact that this number is mapped
on the nth sheet shows that we are considering the nth branch of the
logarithm.

In order to represent geometrically the fact that the nth branch of the
logarithm, on the lower part of the cut of the nth plane, agrees with the
(n + 1)th branch of the logarithm on the upper part of the cut in the
(n + 1)th plane, we paste together the nth plane and the (n + 1)th,
connecting the lower part of the cut in the nth plane with the upper part
of the cut in the (n + 1)th plane. This construction leads us to a
many-sheeted surface, having the form of a spiral staircase (figure 23). The
role of the central column of the staircase is played by the point $z = 0$.

Fig. 23.


If a point passes from one sheet to another, then the complex number
returns to its original value, but the function Ln z passes from one branch
to another.

The surface so constructed is called the Riemann surface of the function
Ln z. Riemann first introduced the idea of constructing surfaces
representing the character of multiple-valued analytic functions and showed
the fruitfulness of this idea.

Let us also discuss the construction of the Riemann surface for the
function $w = \sqrt{z}$. This function is double-valued and has a branch
point at the origin.

We imagine two copies of the z-plane, placed one on top of the
other and both cut along the positive part of the axis Ox. If z starts from
$z_0$ and describes a closed contour C containing the origin, then $\sqrt{z}$
passes from one branch to the other, and thus the point on the Riemann
surface passes from one sheet to the other. To arrange this, we paste the
lower border of the cut in the first sheet to the upper border of the cut
in the second sheet. If z describes the closed contour C a second time,
then $\sqrt{z}$ must return to its original value, so that the point in the Riemann
surface must return to its original position on the first sheet. To arrange
this, we must now attach the lower border of the second sheet to the
upper border of the first sheet. As a result we get a two-sheeted surface,
intersecting itself along the positive part of the axis Ox. Some idea of
this surface may be obtained from figure 24, showing the neighborhood
of the point $z = 0$.

Fig. 24.

In the same way we can construct a many-sheeted surface to represent
the character of any given multiple-valued function. The different sheets
of such a surface are connected with one another around branch points
of the function. It turns out that the properties of analytic functions are
closely connected with the geometric properties of Riemann surfaces.
These surfaces are not only an auxiliary means of illustrating the character
of a multiple-valued function but also play a fundamental role in the
study of the properties of analytic functions and the development of
methods of investigating them. Riemann surfaces formed a kind of
bridge between analysis and geometry in the region of complex variables,
enabling us not only to relate to geometry the most profound analytic
properties of the functions but also to develop a whole new region
of geometry, namely topology, which investigates those geometric
properties of figures which remain unchanged under continuous
deformation.

One of the clearest examples of the significance of the geometric
properties of Riemann surfaces is the theory of algebraic functions, i.e.,
functions obtained as the solution of an equation

$$f(z, w) = 0,$$

the left side of which is a polynomial in z and w. The Riemann surface
of such a function may always be deformed continuously into a sphere
or else into a sphere with handles (figure 25). The characteristic property
of these surfaces is the number of handles. This number is called the
genus of the surface and of the algebraic function from which the surface
was obtained. It turns out that the genus of an algebraic function determines
its most important properties.





§6. Conclusion 

The theory of analytic functions arose in connection with the problem
of solving algebraic equations. But as it developed it came into constant
contact with newer and newer branches of mathematics. It shed light
on the fundamental classes of functions occurring in analysis, mechanics,
and mathematical physics. Many of the central facts of analysis could
at last be made clear only by passing to the complex domain. Functions
of a complex variable received an immediate physical interpretation in
the important vector fields of hydrodynamics and electrodynamics and
provided a remarkable apparatus for the solution of problems arising
in these branches of science. Relations were discovered between the
theory of functions and problems in the theory of heat conduction,
elasticity, and so forth.

General questions in the theory of differential equations and special
methods for their solution have always been based to a great extent on
the theory of functions of a complex variable. Analytic functions entered
naturally into the theory of integral equations and the general theory
of linear operators. Close connections were discovered between the
theory of analytic functions and geometry. All these constantly widening
connections of the theory of functions with new areas of mathematics
and science show the vitality of the theory and the continuous enrichment
of its range of problems.

In our survey we have not been able to present a complete picture of
all the manifold ramifications of the theory of functions. We have tried
only to give some idea of the widely varied nature of its problems by
indicating the basic elementary facts for some of the various fundamental
directions in which the theory has moved. Some of its most important
aspects, its connection with the theory of differential equations and
special functions, with elliptic and automorphic functions, with the
theory of trigonometric series, and with many other branches of
mathematics, have been completely ignored in our discussion. In other cases
we have had to restrict ourselves to the briefest indications. But we hope
that this survey will give the reader a general idea of the character and
significance of the theory of functions of a complex variable.


PRIME NUMBERS


§1. The Study of the Theory of Numbers 

Whole numbers. As the reader knows from the introduction to
Chapter I, mankind had to deal even in the most ancient times with whole
numbers, but the passage of many centuries was necessary to produce
the concept of the infinite sequence of natural numbers

$$1, 2, 3, 4, 5, \ldots. \qquad (1)$$

Nowadays, in the most various questions of practical activity, we are
constantly faced with problems involving whole numbers. Whole numbers
reflect many quantitative relations in nature; in all questions connected
with discrete objects, they form the necessary mathematical apparatus.

Moreover, whole numbers play an important role in the study of the
continuous. Thus, for example, in mathematical analysis one considers
the expansion of an analytic function in a power series with integral
powers of x

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots.$$

All computations are essentially carried out with whole numbers, as is
immediately obvious from even a superficial examination of automatic
computing machines or desk calculators, or of mathematical tables, such
as tables of logarithms. After these operations on whole numbers have
been carried out, decimal points are inserted in well-defined positions,
corresponding to the formation of decimal fractions; such fractions, like
all rational fractions, represent quotients of two whole numbers. In
dealing with any real number in practical work (for example, $\pi$), we
replace it in fact by a rational fraction (for example, we assume that
$\pi = 22/7$, or that $\pi = 3.14$).

While the establishment of rules for operating on numbers is the concern
of arithmetic, the deeper properties of the sequence of natural numbers (1),
extended to include zero and the negative integers, are studied in the
theory of numbers, which is the science of the system of integers and, in
an extended sense, also of systems of numbers constructed in some definite
manner from the integers (see, in particular, §5 of this chapter). It is
understood that the theory of numbers considers integers not as isolated
one from another but as interdependent; the theory of numbers studies
properties of integers that are defined by certain relations among them.

One of the basic questions in the theory of numbers concerns divisibility
of one number by another; if the result of dividing the integer a by the
integer b (not equal to zero) is an integer, i.e., if

$$a = b \cdot c$$

(a, b, c are integers), then we say that a is divisible by b or that b divides a.
If the result of dividing the integer a by the integer b is a fraction, then we
say that a is not divisible by b. Questions of divisibility of numbers are
encountered constantly in practice and also play an important role in
some questions of mathematical analysis. For example, if the expansion
of a function in integer powers of x

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots \qquad (2)$$

is such that all odd coefficients (with indices not divisible by 2) are equal
to zero, i.e., if

$$f(x) = a_0 + a_2 x^2 + \cdots + a_{2k} x^{2k} + \cdots,$$

then the function satisfies the condition

$$f(-x) = f(x);$$

such a function is called an even function, and its graph is symmetric
with respect to the axis of ordinates. But if in the expansion (2) all the even
coefficients (with indices divisible by 2) are equal to zero, in other words,
if

$$f(x) = a_1 x + a_3 x^3 + \cdots + a_{2k+1} x^{2k+1} + \cdots,$$

then

$$f(-x) = -f(x);$$

in this case the function is called odd, and its graph is symmetric with
respect to the origin.

Thus, for example,

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \quad \text{(odd function)};$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots \quad \text{(even function)}.$$


The geometric question of the possibility of construction of a regular
n-polygon with ruler and compass turns out to depend on the arithmetic
nature of the number n.*

A prime number is any integer (greater than one) that has only the two
positive integer divisors, one and itself. One is not considered as a prime
number since it does not have two different positive divisors.

Thus the prime numbers are

$$2, 3, 5, 7, 11, 13, 17, 19, 23, 29, \ldots. \qquad (3)$$
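
The beginning of sequence (3) can be generated mechanically. The sketch below uses the sieve of Eratosthenes, a classical method that the text itself does not describe at this point; the function name `primes_up_to` is illustrative.

```python
def primes_up_to(limit):
    """Return all primes p <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False       # 0 and 1 are not prime
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out every multiple of p, starting from p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    print(primes_up_to(29))
```

Calling `primes_up_to(29)` reproduces the ten primes listed in (3).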

Prime numbers play a fundamental role in the theory of numbers because
of the basic theorem: Every integer $n > 1$ may be represented as the
product of prime numbers (with possible repetition of factors), i.e., in the
form

$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \qquad (4)$$

where $p_1 < p_2 < \cdots < p_k$ are primes and $a_1, a_2, \ldots, a_k$ are integers not
less than one; furthermore, the representation of n in the form (4) is unique.
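
The representation (4) can be computed for small n by trial division. This is a minimal sketch, adequate only for modest inputs; the function name `factorize` and the output format (a list of pairs of prime and exponent) are choices made for the example.

```python
def factorize(n):
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k and n = prod p_i^a_i."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors.append((n, 1))
    return factors


if __name__ == "__main__":
    print(factorize(360))   # 360 = 2^3 * 3^2 * 5
```

Because the trial divisors are taken in increasing order, the primes come out already sorted as required in (4).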

The properties of numbers connected with the representation of numbers
as a sum of terms are called additive; the properties of numbers relating
to their representation in the form of a product are called multiplicative.
The connection between additive and multiplicative properties of numbers
is extraordinarily complicated; it has given rise to a series of basic
problems in the theory of numbers.

The existence of these difficult problems in the theory of numbers
together with the fact that the whole number is not only the simplest and
clearest of all mathematical concepts but is closely related to objective
reality have led to the creation, for use in the theory of numbers, of
profound new ideas and powerful methods, many of which have become
important in other branches of mathematics as well. For example, a vast
influence on all developments of mathematics has been exerted by the idea
of the infinite sequence of natural numbers, reflecting the infiniteness of
the material world in space and time. Of great significance also is the fact
that the terms in the sequence of natural numbers are ordered. Study of the
operations on integers has led to the concept of an algebraic operation,
which plays a basic role in several different branches of mathematics.

* See Chapter IV.

Of immense importance in mathematics has been the concept,
particularly applicable to arithmetical questions, of an algorithm, a process of
solving problems based on the repeated carrying out of a strictly defined
procedure; in particular, the role of the algorithm is fundamental to the
use of mathematical machines. The essential nature of the algorithmic
method for solving a problem is clearly illustrated by the Euclidean
algorithm for finding the greatest common divisor of two natural numbers
a and b.

Suppose a > b. We divide a by b and find the quotient $q_1$ and, if b does
not divide a, the remainder $r_2$:

$$a = bq_1 + r_2, \qquad 0 < r_2 < b. \tag{$5_1$}$$

Further, if $r_2 \ne 0$, we divide b by $r_2$:

$$b = r_2 q_2 + r_3, \qquad 0 \le r_3 < r_2. \tag{$5_2$}$$

Then we divide $r_2$ by $r_3$ and continue until we get to a zero remainder,
which must necessarily happen for a decreasing sequence of nonnegative
integers $r_2 > r_3 > \cdots$. Let

$$r_{n-2} = r_{n-1} q_{n-1} + r_n, \tag{$5_{n-1}$}$$

$$r_{n-1} = r_n q_n; \tag{$5_n$}$$

then $r_n$ is at once seen to be the greatest common divisor of a and b. For if
two integers l and m have a common divisor d, then for any integers h
and k the number hl + km will also be divisible by d. Let us denote the
greatest common divisor of a and b by $\delta$. From equation ($5_1$) we see that
$\delta$ is a divisor of $r_2$; from ($5_2$) it follows that $\delta$ is a divisor of $r_3$,
and so on; from ($5_{n-1}$) that $\delta$ is a divisor of $r_n$. But $r_n$ itself is a
common divisor of a and b, since in ($5_n$) we see that $r_n$ divides $r_{n-1}$,
from ($5_{n-1}$) that $r_n$ divides $r_{n-2}$, etc. Thus $\delta$ is identical with $r_n$
and the problem of finding the greatest common divisor of a and b is solved.
We have here a well-defined procedure, of the same type for all a and b,
which leads us automatically to the desired result and is thus a
characteristic example of an algorithm.
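The chain of divisions described above can be sketched in a few lines of
Python. This is only an illustration: the function name `euclid_steps` and
the decision to record every division step are assumptions made here, not
part of the text.

```python
def euclid_steps(a, b):
    """Run the Euclidean algorithm on natural numbers a and b.

    Returns (gcd, steps), where each step is a tuple
    (dividend, divisor, quotient, remainder) for one division.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be natural numbers")
    steps = []
    while b != 0:
        quotient, remainder = divmod(a, b)
        steps.append((a, b, quotient, remainder))
        # The divisor becomes the new dividend, the remainder the new divisor.
        a, b = b, remainder
    # The last nonzero remainder (now stored in a) is the greatest common divisor.
    return a, steps


if __name__ == "__main__":
    gcd, steps = euclid_steps(1071, 462)
    for dividend, divisor, quotient, remainder in steps:
        print(f"{dividend} = {divisor} * {quotient} + {remainder}")
    print("greatest common divisor:", gcd)
```

For a = 1071 and b = 462 the recorded divisions are 1071 = 462 · 2 + 147,
462 = 147 · 3 + 21 and 147 = 21 · 7, so the last nonzero remainder is 21.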

The theory of numbers has exerted an influence on the development of
many mathematical disciplines: mathematical analysis, geometry, classical
and contemporary algebra, the theory of summability of series, the theory
of probability, and so forth.

Methods of the theory of numbers. In its methods, the theory of
numbers is divided into four parts: elementary, analytic, algebraic, and
geometric.

The elementary theory of numbers studies the properties of integers
without calling on other mathematical disciplines. Thus, starting from
Euler’s identity

$$
\begin{aligned}
(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2)
  &= (x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4)^2 \\
  &\quad + (x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3)^2 \\
  &\quad + (x_1y_3 - x_3y_1 + x_4y_2 - x_2y_4)^2 \\
  &\quad + (x_1y_4 - x_4y_1 + x_2y_3 - x_3y_2)^2,
\end{aligned}
\tag{6}
$$

we may very simply prove that every integer N > 0 may be expressed as
the sum of the squares of four integers; i.e., every integer is representable
in the form

$$N = x^2 + y^2 + z^2 + u^2,$$

where x, y, z, and u are integers.*
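Both statements can be checked numerically. The sketch below evaluates the
right side of Euler’s identity and searches for a four-square representation
by brute force; the function names are illustrative assumptions, and the
exhaustive search is merely a check for small N, not the proof indicated in
the text.

```python
from math import isqrt


def euler_identity_rhs(x, y):
    """Right side of Euler's four-square identity for 4-tuples x and y."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return ((x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4) ** 2
            + (x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3) ** 2
            + (x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4) ** 2
            + (x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2) ** 2)


def four_squares(n):
    """Search for integers x >= y >= z >= u >= 0 with x^2 + y^2 + z^2 + u^2 = n."""
    for x in range(isqrt(n), -1, -1):
        for y in range(min(x, isqrt(n - x * x)), -1, -1):
            rest = n - x * x - y * y
            for z in range(min(y, isqrt(rest)), -1, -1):
                u_squared = rest - z * z
                u = isqrt(u_squared)
                if u * u == u_squared and u <= z:
                    return (x, y, z, u)
    return None


if __name__ == "__main__":
    x, y = (1, 2, 3, 4), (5, 6, 7, 8)
    left = sum(t * t for t in x) * sum(t * t for t in y)
    print(left, euler_identity_rhs(x, y))
    print(four_squares(7), four_squares(2023))
```

The identity is what reduces the general theorem to the case of prime N:
a product of two sums of four squares is again a sum of four squares.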

The analytic theory of numbers makes use of mathematical analysis for 
problems of the theory of numbers. Its foundations were laid by Euler 
and it was developed by P. L. Čebyšev, Dirichlet, Riemann, Ramanujan,
Hardy, Littlewood, and other mathematicians, its most powerful methods
being due to Vinogradov. This part of the theory of numbers is closely
connected with the theory of functions of a complex variable (a theory
that is very rich in practical applications), and also with the theory of
series, the theory of probability, and other branches of mathematics.

The basic concept of the algebraic theory of numbers is the concept of
an algebraic number, i.e., a root of the equation

$$a_0 x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_{n-1} x + a_n = 0,$$

where $a_0, a_1, a_2, \dots, a_n$ are integers.†

The greatest contributions to this branch of the theory of numbers were 
made by Lagrange, Gauss, Kummer, E. I. Zolotarev, Dedekind, A. O. 
Gel’fond, and others. 

The basic objects of study in the geometric theory of numbers are
“space lattices”; that is, systems consisting entirely of “integral” points,
all of whose coordinates in a given rectilinear coordinate system,
rectangular or oblique, are integers. Space lattices have great significance
in geometry and in crystallography, and are intimately connected with
important questions in the theory of numbers; in particular, with the
arithmetic theory of quadratic forms, i.e., the theory of quadratic forms
with integer coefficients and integer variables. Basic work in the geometric
theory of numbers is due to H. Minkowski and G. F. Voronoĭ.

* We have here an example of an indeterminate equation, to be investigated
from the point of view of its solvability in integers.

† If $a_0 = 1$, the algebraic number is called an algebraic integer. A number
which is not algebraic is called transcendental.

It is to be noted that the methods of the analytic theory of numbers
have important applications in the other two branches, the algebraic
and the geometric. Particularly noteworthy is the problem of counting
the number of integral points in a given domain, a problem which is
important in certain branches of physics. Various means of approach to
this problem were indicated by G. F. Voronoĭ and methods for its solution
were developed by I. M. Vinogradov.

The deep-lying reason for the power of analytic methods in the theory
of numbers is that they enrich our study of the interrelations among
discrete integers by summoning to our aid new relations among continuous
magnitudes.

We must emphasize that in this chapter we are considering only certain 
selected questions in the theory of numbers. 


§2. The Investigation of Problems Concerning Prime Numbers

The number of primes is infinite. In considering the sequence (3)
of prime numbers

$$2, 3, 5, 7, 11, 13, 17, 19, \dots$$

it is natural to ask the question: Is this sequence infinite? The fact that any
integer can be represented in the form (4) does not yet solve the problem,
since the exponents $a_1, \dots, a_k$ may take on an infinite set of values. An
affirmative answer to the question was given by Euclid, who proved that
the number of primes cannot be equal to any finite integer k.

Let $p_1, p_2, \dots, p_k$ be primes; then the number

$$m = p_1 p_2 \cdots p_k + 1,$$

since it is an integer greater than one, is either itself a prime or has a
prime factor. But m is not divisible by any one of the primes
$p_1, p_2, \dots, p_k$ since, if it were, the difference $m - p_1 p_2 \cdots p_k$
would also be divisible by this number; which is impossible, since this
difference is equal to one. Thus, either m itself is a prime or it is divisible
by some prime $p_{k+1}$, different from $p_1, \dots, p_k$. So the set of primes
cannot be finite.
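Euclid’s construction is easy to try out. The following sketch forms m from
a given list of primes and returns its smallest prime factor, which by the
argument above cannot belong to the list. The helper names are assumptions
chosen for this illustration.

```python
def smallest_prime_factor(m):
    """Smallest prime dividing m (m >= 2), found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m itself is prime


def euclid_new_prime(primes):
    """Return (m, p) where m = product(primes) + 1 and p is a prime factor of m."""
    m = 1
    for p in primes:
        m *= p
    m += 1
    return m, smallest_prime_factor(m)


if __name__ == "__main__":
    for known in ([2, 3, 5], [2, 3, 5, 7, 11, 13]):
        m, p = euclid_new_prime(known)
        print(known, "->", m, "has the prime factor", p)
```

Note that m itself need not be prime: for the primes 2, 3, 5, 7, 11, 13 we
get m = 30031 = 59 · 509, and both factors are new primes.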

The sieve of Eratosthenes. The Greek mathematician Eratosthenes in
the 3rd century B.C. described the following “sieve” method for finding
all the primes not exceeding a given natural number N. We write all the
integers from 1 through N

$$1, 2, 3, 4, \dots, N,$$

and then cross out, from the left, first the number 1, then all numbers
except 2 that are multiples of 2, then all except 3 that are multiples of 3,
and then all except 5 that are multiples of 5 (the multiples of four have
already been crossed out), and so forth; the remaining numbers will then
be primes. It is worthy of note that the process of crossing out needs to
be continued only to the point where we have found all primes not exceeding
$\sqrt{N}$, since every composite number (i.e., not prime) that is not greater
than N will necessarily have a prime divisor not exceeding $\sqrt{N}$.
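The sieve translates almost word for word into code. The sketch below is a
minimal version under the assumption that N is small enough for a list of
N + 1 flags to fit in memory; the function name is illustrative.

```python
from math import isqrt


def eratosthenes(N):
    """Return the list of all primes not exceeding N."""
    if N < 2:
        return []
    crossed_out = [False] * (N + 1)
    crossed_out[0] = crossed_out[1] = True  # 0 and 1 are not primes
    # Crossing out is needed only for primes p not exceeding sqrt(N).
    for p in range(2, isqrt(N) + 1):
        if not crossed_out[p]:
            # Cross out every multiple of p except p itself.
            for multiple in range(2 * p, N + 1, p):
                crossed_out[multiple] = True
    return [k for k in range(N + 1) if not crossed_out[k]]


if __name__ == "__main__":
    print(eratosthenes(50))
```

Starting the inner loop at p · p instead of 2p would give the same result
with less work, since smaller multiples of p have already been crossed out
as multiples of smaller primes.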

Examination of the sequence of prime numbers in the sequence of all
positive integers would lead us to believe that the law of distribution of
prime numbers must be very complicated; for example, we encounter
primes such as 8,004,119 and 8,004,121 (the so-called twin primes)
whose difference is two, and also primes that are far from each other,
such as 86,629 and 86,677, between which there is no other prime. But the
tables show that “on the average” prime numbers occur more and more
rarely as we traverse the sequence of integers.


Euler’s identity; his proof that the number of primes is infinite. The
great 18th century mathematician L. Euler, a member of the Russian
Academy of Sciences, introduced the following function, with argument
$s > 1$, which at the present time is denoted by $\zeta(s)$:

$$\zeta(s) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots + \frac{1}{n^s} + \cdots. \tag{7}$$

As we know from Chapter II, this series converges for $s > 1$ (and
diverges for $s \le 1$). Euler derived a remarkable identity that plays a very
important role in the theory of prime numbers:

$$\zeta(s) = \prod_p \frac{1}{1 - \dfrac{1}{p^s}}, \tag{8}$$

where the symbol $\prod_p$ means that we must multiply together the expressions
$1/[1 - (1/p^s)]$ for all primes p. To see how the proof of this identity goes,
we note that $1/(1 - q) = 1 + q + q^2 + \cdots$ for $|q| < 1$, so that

$$\frac{1}{1 - \dfrac{1}{p^s}} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots.$$

Multiplying these series for the various primes p and recalling that every n
is uniquely representable as the product of primes, we find that

$$\prod_p \left(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots\right) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots + \frac{1}{n^s} + \cdots.$$

For a rigorous proof, of course, we must establish the validity of our limit
process, but this presents no particular difficulty.
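Identity (8) can be observed numerically by truncating both sides. The
sketch below compares a partial sum of the series (7) with a partial Euler
product for s = 2, where the common value is known to be π²/6. The function
names and the truncation points are arbitrary choices for illustration, and
agreement of truncations is of course no substitute for the limit argument.

```python
from math import pi


def zeta_partial_sum(s, terms):
    """Sum of 1/n^s for n = 1, ..., terms."""
    return sum(1.0 / n ** s for n in range(1, terms + 1))


def primes_up_to(bound):
    """All primes not exceeding bound, by trial division (small bounds only)."""
    primes = []
    for n in range(2, bound + 1):
        if all(n % p != 0 for p in primes if p * p <= n):
            primes.append(n)
    return primes


def euler_partial_product(s, prime_bound):
    """Product of 1 / (1 - p^(-s)) over all primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


if __name__ == "__main__":
    print("partial sum    :", zeta_partial_sum(2, 10_000))
    print("partial product:", euler_partial_product(2, 1_000))
    print("pi^2 / 6       :", pi ** 2 / 6)
```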

From identity (8) we may derive as a corollary the fact that the series
$\sum_p 1/p$, consisting of the reciprocals of all the primes, diverges (this
provides a new proof of the fact already known to us that the prime numbers
cannot be finite in number), and also that the quotient of the number of
prime numbers not exceeding x, divided by x itself, converges to zero for
unboundedly increasing x.


The investigations of P. L. Čebyšev on the distribution of the prime
numbers in the sequence of natural numbers. We denote by $\pi(x)$, as is
now customary, the number of prime numbers not exceeding x; for
example, $\pi(10) = 4$, since 2, 3, 5, and 7 are all the primes not exceeding
10, and $\pi(\pi) = 2$, since 2 and 3 are all the primes not exceeding $\pi$. As
noted earlier

$$\lim_{x \to \infty} \frac{\pi(x)}{x} = 0.$$

But just how does the ratio $\pi(x)/x$ decrease; in other words what is the
law of growth for $\pi(x)$? May we look for a fairly simple, well-known
function that differs only a little from $\pi(x)$? The famous French
mathematician Legendre, in considering tables of prime numbers, stated that
such a function will be

$$\frac{x}{\ln x - A}, \tag{9}$$

where A = 1.08..., but he did not give a proof of this proposition. Gauss,
who also considered the question of the distribution of the prime numbers,
conjectured that $\pi(x)$ differs comparatively little from
$\int_2^x dt/\ln t$ (we note that the following relation holds:

$$\lim_{x \to \infty} \frac{\displaystyle\int_2^x \frac{dt}{\ln t}}{\dfrac{x}{\ln x}} = 1, \tag{10}$$

which is established by integrating by parts and finding estimates for the
new integral).

The first mathematician since the time of Euclid to make real progress
in the very difficult question of the distribution of the prime numbers
was P. L. Čebyšev. In 1848, basing his work on a study of Euler’s function
$\zeta(s)$ for real s, Čebyšev showed that for arbitrarily large positive n and
arbitrarily small positive $\alpha$ there exist arbitrarily large values of x for
which

$$\pi(x) > \int_2^x \frac{dt}{\ln t} - \frac{\alpha x}{\ln^n x},$$

and also arbitrarily large x for which

$$\pi(x) < \int_2^x \frac{dt}{\ln t} + \frac{\alpha x}{\ln^n x},$$

which is in good agreement with Gauss’s assumption. In particular,
taking n = 1 and applying (10), Čebyšev established the fact that

$$\lim_{x \to \infty} \frac{\pi(x)}{\dfrac{x}{\ln x}} = 1, \tag{11}$$

provided that the limit in (11) exists.

Čebyšev also refuted Legendre’s assumption concerning the value of the
constant A which occurs in expression (9) as giving the best approximation
to $\pi(x)$; he showed that this value can only be A = 1.
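The quantities compared in this discussion are easy to tabulate for moderate
x. The sketch below counts primes with a sieve and evaluates x/ln x,
Legendre’s expression (9) and Gauss’s integral side by side. The midpoint
rule used for the integral, the step count, and all function names are
assumptions of this illustration.

```python
from math import floor, log


def prime_count(x):
    """pi(x): the number of primes not exceeding x (x may be non-integral)."""
    n = floor(x)
    if n < 2:
        return 0
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return sum(is_prime)


def legendre_estimate(x, A=1.08):
    """Legendre's expression x / (ln x - A)."""
    return x / (log(x) - A)


def gauss_integral(x, steps=100_000):
    """Midpoint-rule approximation of the integral of dt / ln t from 2 to x."""
    width = (x - 2) / steps
    return width * sum(1.0 / log(2 + (k + 0.5) * width) for k in range(steps))


if __name__ == "__main__":
    print("x, pi(x), x/ln x, Legendre, Gauss integral")
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        print(x, prime_count(x), round(x / log(x), 1),
              round(legendre_estimate(x), 1), round(gauss_integral(x), 1))
```

Such a table only suggests how close the approximations are in a small
range; statements about limits, such as (11), cannot be read off from it.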

The well-known French mathematician Bertrand was led by his
investigations in the theory of groups to the following conjecture, which
he verified empirically from the tables up to quite large values of n: If
n > 3, then between n and 2n - 2 there is at least one prime. All the
attempts of Bertrand, and of other mathematicians, to prove this
conjecture proved fruitless until 1850, when Čebyšev published his second
article on prime numbers, in which he not only proved the conjecture
(“Bertrand’s postulate”) but also showed that for sufficiently large x

$$A_1 < \frac{\pi(x)}{\dfrac{x}{\ln x}} < A_2, \tag{12}$$

where

$$0.92 < A_1 < 1 \quad\text{and}\quad 1 < A_2 < 1.1.$$
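Bertrand’s empirical verification is easy to repeat by machine. The sketch
below checks, for each n in a finite range, that some prime lies strictly
between n and 2n - 2. The function names and the chosen range are
assumptions of this illustration, and a finite check is not a proof.

```python
def is_prime(m):
    """Primality test by trial division (adequate for small m)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def primes_strictly_between(low, high):
    """All primes p with low < p < high."""
    return [m for m in range(low + 1, high) if is_prime(m)]


def bertrand_holds(n):
    """True if there is a prime strictly between n and 2n - 2."""
    return len(primes_strictly_between(n, 2 * n - 2)) > 0


if __name__ == "__main__":
    for n in (4, 5, 10, 100):
        print(n, primes_strictly_between(n, 2 * n - 2)[:5])
    print(all(bertrand_holds(n) for n in range(4, 2000)))
```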

In §3 we give a simplified presentation of Čebyšev’s method, which
leads, however, to considerably less precise results than those of Čebyšev
himself.

Čebyšev’s works had a great influence on many mathematicians, in
particular Sylvester and Poincaré. In the course of more than forty years
a number of scientists busied themselves with the improvement of
Čebyšev’s inequality (12) (increasing the constant on the left side of the
inequality and decreasing the constant on the right side), but they were
unable to establish the existence of the limit

$$\lim_{x \to \infty} \frac{\pi(x)}{\dfrac{x}{\ln x}}$$

(as was pointed out previously, we know from the work of Čebyšev that
if this limit exists it is equal to one).

Only in 1896 did Hadamard, using arguments from the theory of
functions of a complex variable, prove that the function $\theta(x)$, introduced
by Čebyšev and defined by the equation

$$\theta(x) = \sum_{p \le x} \ln p,$$

satisfies the condition

$$\lim_{x \to \infty} \frac{\theta(x)}{x} = 1, \tag{13}$$

from which it is relatively easy to obtain the relation (11) without any
further assumptions; this is the so-called asymptotic law for the
distribution of primes.

The result (13) was found by Hadamard on the basis of the investigations
by the famous German 19th century mathematician Riemann, who
studied the $\zeta(s)$ function of Euler (7) for complex values of the variable
$s = \sigma + it$ (Čebyšev himself had considered this function only for real
values of the argument).*

Riemann showed that the function $\zeta(s)$, defined in the half plane
$\sigma > 1$ by the series (7)

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s},$$

has the property that

$$\zeta(s) - \frac{1}{s - 1}$$

is an entire transcendental function (for $\sigma \le 1$ the series (7) ceases to
converge, but the values of $\zeta(s)$ in the half plane $\sigma \le 1$ are defined by
analytic continuation) (see Chapter IX). Riemann made the conjecture
(“the Riemann hypothesis”) that all roots of $\zeta(s)$ in the strip
$0 < \sigma < 1$ have real part equal to $\tfrac{1}{2}$, i.e., lie on the straight line
$\sigma = \tfrac{1}{2}$; the question of the correctness of this assumption remains
open to this day.

* In 1949 A. Selberg gave an elementary proof (i.e., not using complex
variables) of the asymptotic law of distribution of primes.

An important step in the proof of (13) was the establishment of the fact
that on the straight line $\sigma = 1$ there are no roots of $\zeta(s)$.

The investigation of the behavior of $\zeta(s)$ led to the development of an
elegant theory of entire and meromorphic functions, with important
practical applications.


The work of Vinogradov and his students in the theory of prime numbers.
From equation (13), which by (10) may be written in the form

$$\lim_{x \to \infty} \frac{\pi(x)}{\displaystyle\int_2^x \frac{dt}{\ln t}} = 1, \tag{14}$$

there arose the question of the degree of exactness with which the function
$\int_2^x dt/\ln t$ represents $\pi(x)$. The best results in this direction were
found by N. G. Čudakov and were based on Vinogradov’s method of
trigonometric sums (this method will be described in §4), which also allowed
Čudakov to decrease considerably the bounds between which we can find at
least one prime. Namely, it had been established previously that if we
consider the sequence

$$1^{250}, 2^{250}, 3^{250}, \dots, n^{250}, (n + 1)^{250}, \dots, \tag{15}$$

then, starting with some $n = n_0$, there must exist, between any two
adjacent terms, i.e., between $n^{250}$ and $(n + 1)^{250}$, at least one prime.

We note that, as follows from the binomial formula

$$(n + 1)^{250} - n^{250} > 250\, n^{249},$$

this difference is very large. N. G. Čudakov succeeded in replacing the
sequence (15) by

$$1^4, 2^4, 3^4, \dots, n^4, (n + 1)^4, \dots, \tag{16}$$

whose terms lie considerably closer together than those of the sequence
(15) but which also contains at least one prime between every two
successive terms, i.e., between $n^4$ and $(n + 1)^4$, beginning at some $n = n_0$.
Subsequently, this result has been improved by replacing the fourth
powers by cubes.

If k and l are relatively prime, i.e., have no common divisor larger than
one, then an arithmetic progression with general term kt + l contains
infinitely many prime numbers. This fact, a generalization of the result of
Euclid, was established in the 19th century by Dirichlet. But can we find
a bound that will certainly not be exceeded by the smallest prime in the
progression kt + l? The Leningrad mathematician Ju. V. Linnik proved
the existence of an absolute constant C with the property that in the
progression kt + l (k and l relatively prime) there necessarily exists at
least one prime less than $k^C$. Thus Linnik provided an essentially complete
solution of the problem, raised many years before, of the least prime in an
arithmetic progression; further investigators can only decrease the value of
the constant C. Linnik also carried out very important investigations
concerning the zeros of the function $\zeta(s)$ and more general functions.
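The quantity bounded by Linnik’s theorem, the least prime in a progression
kt + l, can be computed directly for small k. The sketch below does this by
stepping through the progression; the function names are assumptions of this
illustration, and the search says nothing about the value of the constant C.

```python
from math import gcd


def is_prime(m):
    """Primality test by trial division (adequate for small m)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def least_prime_in_progression(k, l):
    """Smallest prime among l, l + k, l + 2k, ... for relatively prime k and l."""
    if k < 1 or l < 1 or gcd(k, l) != 1:
        raise ValueError("k and l must be positive and relatively prime")
    term = l
    # Dirichlet's theorem guarantees that this loop terminates.
    while not is_prime(term):
        term += k
    return term


if __name__ == "__main__":
    for k in (4, 7, 10, 30):
        worst = max(least_prime_in_progression(k, l)
                    for l in range(1, k + 1) if gcd(k, l) == 1)
        print(f"k = {k}: largest least prime over all admissible l is {worst}")
```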

As mentioned previously, the best results with regard to the distribution
of primes were found by the method of Vinogradov for estimating
trigonometric sums.

A trigonometric sum is a sum of the form

$$\sum_{A < x \le B} e^{2\pi i f(x)},$$

where f(x) is a real function of x, and x takes on all integral values between
A and B, or some specific subset of these values, for example the primes
between A and B. Since the modulus of $e^{2\pi i z}$ for real z is equal to one,
and the modulus of a sum does not exceed the sum of the moduli of its terms,
we have

$$\left| \sum_{x=1}^{P} e^{2\pi i f(x)} \right| \le P. \tag{17}$$
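How far a trigonometric sum can fall below the bound P in (17) is easy to
see numerically. The sketch below evaluates the sum for x = 1, ..., P with
two sample functions: an integer-valued polynomial, for which every term
equals one, and a linear function with irrational coefficient, for which the
terms largely cancel. The function name and the two test functions are
choices made for this illustration only.

```python
import cmath
import math


def trigonometric_sum(f, P):
    """Sum of exp(2*pi*i*f(x)) over x = 1, 2, ..., P."""
    return sum(cmath.exp(2j * math.pi * f(x)) for x in range(1, P + 1))


if __name__ == "__main__":
    P = 100
    integer_valued = abs(trigonometric_sum(lambda x: 3 * x * x + 2 * x + 1, P))
    irrational = abs(trigonometric_sum(lambda x: math.sqrt(2) * x, P))
    print("bound P               :", P)
    print("integer coefficients  :", round(integer_valued, 6))
    print("coefficient sqrt(2)   :", round(irrational, 6))
```

For the linear case the sum is a geometric progression, so its modulus stays
bounded as P grows; for polynomials of higher degree no such elementary
formula is available.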

This “trivial” estimate can be improved considerably in a number of
cases; the decisive steps in this direction were taken by Vinogradov. For
definiteness, let f(x) be a polynomial

$$f(x) = \alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0.$$

If all the $\alpha$ are integers, then $e^{2\pi i f(x)} = 1$ for integral x, and in this
case the estimate (17) obviously cannot be improved. But if
$\alpha_1, \dots, \alpha_n$ are not all integers then, as Vinogradov showed, the
estimate (17) may be sharpened by approximating any of these coefficients by
rational fractions with denominators not exceeding some bound (it may be
shown that any $\alpha$ lying between 0 and 1 is representable in the form
$\alpha = a/q + z$, where a and q are relatively prime integers, $q \le \tau$,
$|z| \le 1/(q\tau)$, and $\tau$ is a preassigned integer greater than 1).

The creation of the method of trigonometric sums by Vinogradov allowed
him to solve a series of very difficult problems in the theory of numbers.
In particular, in 1937 he solved a famous problem stated by Goldbach,
by proving that every sufficiently large odd N is representable as the sum
of three primes

$$N = p_1 + p_2 + p_3. \tag{18}$$

This problem arose in 1742 in correspondence between Euler and another
member of the Russian Academy of Sciences, C. Goldbach, and remained
unsolved for almost two centuries, despite the efforts of a number of
eminent mathematicians.

As we have seen, the inequality (4) shows that prime numbers play a
fundamental role in the multiplicative representation of an odd number
by means of primes. It is easy to show from (18) that one can represent a
sufficiently large even number as the sum of no more than four primes.*
In this manner, the Vinogradov-Goldbach theorem established a profound
connection between additive and multiplicative properties of numbers.

The significance of the method of trigonometric sums created by
Vinogradov is not restricted to the theory of numbers. In particular, it
plays an important role in the theory of functions and in the theory of
probability. Some idea of Vinogradov’s method may be obtained from
§4 of this chapter.

Readers who are interested in a more detailed treatment may consult
Vinogradov’s book “The method of trigonometric sums in the theory of
numbers,” after a preliminary reading of his book “Foundations of the
theory of numbers.”


§3. Čebyšev’s Method

Čebyšev’s $\theta$ function and its estimates. We now give a simplified
presentation of Čebyšev’s method for computing the number of primes
lying within given limits. For brevity we agree to use the following notation:
if B is a positive variable quantity that may grow unboundedly, and A is
another quantity such that |A| grows “no more rapidly” than CB, where
C is a positive constant (more precisely, if there exists a constant C > 0
such that starting from some instant we always have |A|/B < C), then we
will write

$$A = O(B).$$


* The correctness of Euler’s conjecture that every sufficiently large even number
N can be represented as the sum of two primes remains an open question to this day.

This is usually read as: “A is a quantity of the order of B.” Thus, for
example

$$\sin x = O(1),$$

since everywhere

$$|\sin x| \le 1;$$

in exactly the same way

$$5x^3 \cos 2x = O(x^3).$$

We will also denote by [x] the integral part of x, i.e., the largest integer not
exceeding x; thus, for example

$$[\pi] = 3, \quad [5] = 5, \quad [-1.5] = -2, \quad [0.999] = 0.$$

We now pose the following question: Let p be a prime, and n a natural
number, and let n!, as usual, denote the product $1 \cdot 2 \cdot 3 \cdots n$; we
note incidentally that as n increases the value of n! grows very rapidly. What
is the largest power a of the prime p that divides n! with no remainder?

Among the numbers 1, 2, ..., n, there will be precisely $[n/p]$ numbers
divisible by p; the number of these which will also be divisible by $p^2$ is
$[n/p^2]$; further, of these there will be $[n/p^3]$ divisible by $p^3$, etc. Hence
it is easy to show that

$$a = \left[\frac{n}{p}\right] + \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots$$

(where the series terminates, since $[n/p^s] > 0$ only for $n \ge p^s$). Thus, in
the last sum every factor of the product $1 \cdot 2 \cdot 3 \cdots n$ such that the
highest power of the number p by which it is divisible is equal to $p^m$ will
occur precisely m times, once as a multiple of p, once as a multiple of $p^2$,
once as a multiple of $p^3$, ..., and finally once as a multiple of $p^m$.

From this result and from the representability of any natural number
in the form (4) it follows that n! will be the product of powers of the form

$$p^{\left[\frac{n}{p}\right] + \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots},$$

taken for all primes $p \le n$. Thus $\ln(n!)$ will be the sum of the logarithms
of these powers, which can be concisely written in the form

$$\ln n! = \sum_{p \le n} \left( \left[\frac{n}{p}\right] + \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots \right) \ln p. \tag{19}$$
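The formula for the exponent a can be checked directly against n!. The
sketch below computes the exponent for each prime p ≤ n and multiplies the
resulting prime powers together; the function names are assumptions of this
illustration.

```python
from math import factorial


def exponent_in_factorial(n, p):
    """Exponent of the prime p in n!: [n/p] + [n/p^2] + [n/p^3] + ..."""
    exponent = 0
    power = p
    while power <= n:  # the terms vanish as soon as p^s exceeds n
        exponent += n // power
        power *= p
    return exponent


def primes_up_to(n):
    """All primes not exceeding n, by trial division (small n only)."""
    primes = []
    for m in range(2, n + 1):
        if all(m % p != 0 for p in primes if p * p <= m):
            primes.append(m)
    return primes


def factorial_from_primes(n):
    """Rebuild n! as the product of p^exponent over all primes p <= n."""
    result = 1
    for p in primes_up_to(n):
        result *= p ** exponent_in_factorial(n, p)
    return result


if __name__ == "__main__":
    n = 10
    print({p: exponent_in_factorial(n, p) for p in primes_up_to(n)})
    print(factorial_from_primes(n), factorial(n))
```

For n = 10 the exponents of 2, 3, 5, 7 are 8, 4, 2, 1, and indeed
2^8 · 3^4 · 5^2 · 7 = 3628800 = 10!.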


Fig. 1.

We simplify equation (19). Since y = ln x is an increasing function,
we have

$$\ln m = \int_m^{m+1} \ln m \, dx < \int_m^{m+1} \ln x \, dx < \int_m^{m+1} \ln(m + 1) \, dx = \ln(m + 1),$$

as is clear from figure 1. Thus

$$\ln n! = \ln 1 + \ln 2 + \cdots + \ln n < \int_1^2 \ln x \, dx + \int_2^3 \ln x \, dx + \cdots + \int_{n-1}^{n} \ln x \, dx + \ln n = \int_1^n \ln x \, dx + \ln n;$$

on the other hand

$$\ln n! > \ln 1 + \int_1^2 \ln x \, dx + \cdots + \int_{n-2}^{n-1} \ln x \, dx + \int_{n-1}^{n} \ln x \, dx = \int_1^n \ln x \, dx.$$

Using the formula for integration by parts, we find

$$\int_1^n \ln x \, dx = \bigl[x \ln x\bigr]_1^n - \int_1^n x \cdot \frac{1}{x} \, dx = n \ln n - (n - 1).$$

Thus

$$n \ln n - n + 1 < \ln n! < n \ln n - n + 1 + \ln n,$$

and hence it follows that

$$\ln n! = n \ln n + O(n). \tag{20}$$
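The two-sided estimate for ln n! that leads to (20) can be observed
numerically. The sketch below computes ln n! as a sum of logarithms and
compares it with the lower and upper bounds; the function names are
assumptions of this illustration.

```python
from math import log


def log_factorial(n):
    """ln n! computed as ln 1 + ln 2 + ... + ln n."""
    return sum(log(m) for m in range(1, n + 1))


def log_factorial_bounds(n):
    """Lower and upper bounds n ln n - n + 1 and n ln n - n + 1 + ln n."""
    lower = n * log(n) - n + 1
    return lower, lower + log(n)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        lower, upper = log_factorial_bounds(n)
        value = log_factorial(n)
        # The last column shows that (ln n! - n ln n) / n stays bounded.
        print(n, round(lower, 3), round(value, 3), round(upper, 3),
              round((value - n * log(n)) / n, 4))
```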

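We note that $\ln n = O(n)$; further, for $n \to \infty$, the function ln n
increases more slowly than any positive power of n, i.e., for any constant
$\alpha > 0$

$$\lim_{n \to \infty} \frac{\ln n}{n^{\alpha}} = 0, \tag{21}$$

since by the rule for indeterminate forms (cf. Chapter II)

$$\lim_{n \to \infty} \frac{\ln n}{n^{\alpha}} = \lim_{n \to \infty} \frac{1/n}{\alpha n^{\alpha - 1}} = \frac{1}{\alpha} \lim_{n \to \infty} \frac{1}{n^{\alpha}} = 0.$$

Further, we find

$$\sum_{p \le n} \left( \left[\frac{n}{p^2}\right] + \left[\frac{n}{p^3}\right] + \cdots \right) \ln p \le \sum_{p \le n} \left( \frac{n}{p^2} + \frac{n}{p^3} + \cdots \right) \ln p = \sum_{p \le n} \frac{n \ln p}{p(p - 1)} \le 2n \sum_{p \le n} \frac{\ln p}{p^2} < 2n \sum_{m=1}^{\infty} \frac{\ln m}{m^2} = 2nC_0 = O(n), \tag{22}$$

where $C_0$ is the sum of the convergent series $\sum_{m=1}^{\infty} \dfrac{\ln m}{m^2}$. The
absolute convergence of this series is established by using (21), for example,
for $\alpha = \tfrac{1}{2}$, by the comparison test and the so-called integral test for
convergence (cf. Chapter II, §14). In view of (20) and (22), equation (19) may
be put in the form

$$\sum_{p \le n} \left[\frac{n}{p}\right] \ln p = n \ln n + O(n). \tag{23}$$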

We now consider the function introduced by Čebyšev

$$\theta(n) = \sum_{p \le n} \ln p \tag{24}$$

(the logarithm of the product of all prime numbers not exceeding n).

Equation (23) can be rewritten as:

$$\theta\left(\frac{n}{1}\right) + \theta\left(\frac{n}{2}\right) + \theta\left(\frac{n}{3}\right) + \theta\left(\frac{n}{4}\right) + \cdots = n \ln n + O(n). \tag{25}$$

In fact, every given ln p enters into all the sums of the form $\theta(n/s)$,
where $p \le n/s$, i.e., where $s \le n/p$, and the number of such sums
$\theta(n/s)$ is equal to $[n/p]$.

Equation (25) is also valid for any noninteger n. To see this, it is
obviously sufficient to prove that it is true for all x under the condition
$n < x < n + 1$; and for this it is enough to prove that replacing n by x
in the left side of (25) does not change that side, and that the first term in
the right side may increase by an amount which is $O(n)$. But the first
follows from the fact that such a replacement will not increase the value of
any one of the terms of the left-hand side (such an increase would be possible
only if n were increased by at least unity) and, of course, the left side
is not decreased. The second follows from the fact that by the formula
for the increment of a function (cf. Chapter II)

$$f(x) - f(a) = (x - a) f'(\xi), \qquad a < \xi < x,$$

we have

$$x \ln x - n \ln n = (x - n)(\ln \xi + 1), \qquad n < \xi < x,$$

and the right side of this last equation is less than $\ln(n + 1) + 1 = O(n)$,
since $0 < x - n < 1$. From equation (25) let us subtract twice the equation
derived from (25) by replacing n by n/2:

$$\theta\left(\frac{n}{1}\right) + \theta\left(\frac{n}{2}\right) + \theta\left(\frac{n}{3}\right) + \theta\left(\frac{n}{4}\right) + \cdots = n \ln n + O(n),$$

$$2\theta\left(\frac{n}{2}\right) + 2\theta\left(\frac{n}{4}\right) + \cdots = 2 \cdot \frac{n}{2} \ln \frac{n}{2} + O(n);$$

we obtain

$$\theta\left(\frac{n}{1}\right) - \theta\left(\frac{n}{2}\right) + \theta\left(\frac{n}{3}\right) - \theta\left(\frac{n}{4}\right) + \cdots = n \ln 2 + O(n) < C_1 n,$$

where $C_1$ is some positive constant. But $\theta(n/1) - \theta(n/2)$ is not larger
than the whole left side, since the differences $\theta(n/3) - \theta(n/4)$,
$\theta(n/5) - \theta(n/6)$, ... cannot be negative. Thus it follows from this last
inequality that

$$\theta(n) - \theta\left(\frac{n}{2}\right) < C_1 n.$$

Inserting here the numbers n/2, n/4, ... in place of n, we also get

$$\theta\left(\frac{n}{2}\right) - \theta\left(\frac{n}{4}\right) < C_1 \frac{n}{2}, \qquad \theta\left(\frac{n}{4}\right) - \theta\left(\frac{n}{8}\right) < C_1 \frac{n}{4}, \qquad \dots;$$

hence, using the fact that $\theta(n/2^k) = 0$ for sufficiently large k (when
$n/2^k < 2$), addition of terms gives

$$\theta(n) < C_1 \left( n + \frac{n}{2} + \frac{n}{4} + \cdots \right) = 2C_1 n. \tag{26}$$

Returning to equation (23), we find

$$0 \le \sum_{p \le n} \frac{n}{p} \ln p - \sum_{p \le n} \left[\frac{n}{p}\right] \ln p < \sum_{p \le n} \ln p = \theta(n) < 2C_1 n = O(n),$$






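so that equation (23) gives

$$\sum_{p \le n} \frac{n}{p} \ln p = n \ln n + O(n),$$

$$\sum_{p \le n} \frac{\ln p}{p} = \ln n + \theta C, \tag{27}$$

where C is a constant greater than zero and $\theta$ depends on the number n
in such a manner that $|\theta| \le 1$ (here $\theta$ denotes a bounded factor, not
Čebyšev’s function $\theta(n)$).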
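Both estimates obtained so far, namely that $\theta(n)$ grows no faster than a
constant multiple of n and that the sum of (ln p)/p differs from ln n by a
bounded amount, can be watched numerically. The sketch below tabulates
$\theta(n)/n$ and the difference between the sum and ln n for a few values
of n. The function names are assumptions of this illustration, and the table
does not determine the constants $C_1$ and C of the text.

```python
from math import log


def primes_up_to(n):
    """All primes not exceeding n, by the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [k for k in range(n + 1) if is_prime[k]]


def theta(n):
    """The sum of ln p over all primes p <= n."""
    return sum(log(p) for p in primes_up_to(n))


def log_p_over_p_sum(n):
    """The sum of (ln p) / p over all primes p <= n."""
    return sum(log(p) / p for p in primes_up_to(n))


if __name__ == "__main__":
    print("n, theta(n)/n, sum(ln p / p) - ln n")
    for n in (10, 100, 1000, 10_000, 100_000):
        print(n, round(theta(n) / n, 4), round(log_p_over_p_sum(n) - log(n), 4))
```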


An estimate for the number of primes in a given interval. We now
show that one may choose a positive constant M in such a manner that
between n and Mn there will lie as many primes p as desired, if n is
sufficiently large. Namely, we establish simple inequalities for the number
T of primes in the interval $n < p \le Mn$. Obviously,

$$\sum_{n < p \le Mn} \frac{\ln p}{p} = \sum_{p \le Mn} \frac{\ln p}{p} - \sum_{p \le n} \frac{\ln p}{p}. \tag{28}$$

From equation (27), replacing n by Mn, we get

$$\sum_{p \le Mn} \frac{\ln p}{p} = \ln(Mn) + \theta' C = \ln M + \ln n + \theta' C, \tag{29}$$

where $|\theta'| \le 1$; thus, in view of equations (28), (29), and (27), we have

$$\sum_{n < p \le Mn} \frac{\ln p}{p} = \ln M + \theta' C - \theta C = \ln M + 2\theta_0 C,$$

where $|\theta_0| \le 1$, i.e.,

$$\ln M - 2C \le \sum_{n < p \le Mn} \frac{\ln p}{p} \le \ln M + 2C. \tag{30}$$

On the other hand, since $y = \ln x / x$ for x > e is a decreasing function
(since $y' = (1 - \ln x)/x^2 < 0$ for ln x > 1, i.e., x > e), it follows that
for $n \ge 3$

$$T\, \frac{\ln(Mn)}{Mn} < \sum_{n < p \le Mn} \frac{\ln p}{p} < T\, \frac{\ln n}{n};$$

hence, from (30), we have

$$T\, \frac{\ln n}{n} > \ln M - 2C \tag{31}$$

and

$$T\, \frac{\ln(Mn)}{Mn} < \ln M + 2C. \tag{32}$$

We now choose the constant M such that the right side of (31) is equal to
one

$$\ln M - 2C = 1,$$

i.e.,

$$M = e^{2C + 1},$$

and we set

$$L = M(\ln M + 2C).$$

Then for the number T of primes lying between n and Mn, we get from
(31) and (32) the inequalities

$$\frac{n}{\ln n} < T < L\, \frac{n}{\ln n}, \tag{33}$$

which it was our purpose to establish. Since $n/\ln n \to \infty$ for unbounded
increase in n, it follows that $T \to \infty$ also.


§4. Vinogradov’s Method 

Vinogradov’s method in its application to the solution of Goldbach’s
problem. We attempt in this section to give some account of Vinogradov’s
method for the particular case of Goldbach’s problem of representing an
odd number as the sum of three prime numbers.

An expression in the form of an integral for the number of representations
of N as the sum of three primes. Let N be a sufficiently large odd number.
We denote by I(N) the number of representations of N as the sum of three
primes; in other words, the number of solutions of the equation

$$N = p_1 + p_2 + p_3 \tag{34}$$

in prime numbers $p_1$, $p_2$, and $p_3$.

Goldbach’s problem will be solved if it can be established that I(N) > 0.
Vinogradov’s method allows us not only to establish this fact (for
sufficiently large N), but also to find an approximating expression for
I(N).

I(N) may be written in the following form

$$I(N) = \sum_{p_1 \le N} \sum_{p_2 \le N} \sum_{p_3 \le N} \int_0^1 e^{2\pi i (p_1 + p_2 + p_3 - N)\alpha} \, d\alpha, \tag{35}$$

where the summations are taken over the prime numbers not exceeding N.
In fact, for integer $n \ne 0$

$$\int_0^1 e^{2\pi i n \alpha} \, d\alpha = \left[ \frac{e^{2\pi i n \alpha}}{2\pi i n} \right]_0^1 = \frac{1}{2\pi i n} \left( e^{2\pi i n} - e^0 \right) = 0,$$

since

$$e^{2\pi i n} = \cos 2\pi n + i \sin 2\pi n = 1;$$

but if n = 0, then

$$\int_0^1 e^{2\pi i \cdot 0 \cdot \alpha} \, d\alpha = \int_0^1 d\alpha = 1.$$

Thus, every time the primes $p_1$, $p_2$, and $p_3$ have the sum N the integral
inside the summation sign in (35) has the value one, and when the sum
$p_1 + p_2 + p_3 \ne N$, this integral is equal to zero, which proves the validity
of equation (35).

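Since $e^{2\pi i a} \cdot e^{2\pi i b} = e^{2\pi i (a + b)}$ and the integral of a sum of
terms is equal to the sum of the integrals of these terms, it follows from
equation (35) that

$$I(N) = \int_0^1 \Bigl( \sum_{p \le N} e^{2\pi i \alpha p} \Bigr)^3 e^{-2\pi i \alpha N} \, d\alpha.$$

Introducing the notation

$$T_\alpha = \sum_{p \le N} e^{2\pi i \alpha p}, \tag{36}$$

we then have

$$I(N) = \int_0^1 T_\alpha^3 \, e^{-2\pi i \alpha N} \, d\alpha. \tag{37}$$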

Decomposition of the interval of integration into basic and complementary
intervals. Let h be a quantity, chosen in an appropriate manner depending
on N, which increases unboundedly with N but is small in comparison with
N and even with $\sqrt[3]{N}$, and set $\tau = N/h$. Since the function integrated
in (37) has a period equal to one, the interval of integration in (37) may be
replaced by the segment from $-1/\tau$ to $1 - 1/\tau$. Thus

$$I(N) = \int_{-1/\tau}^{1 - 1/\tau} T_\alpha^3 \, e^{-2\pi i \alpha N} \, d\alpha. \tag{38}$$

We now consider all proper irreducible fractions a/q with denominators
not exceeding h, and distinguish in the segment
$-1/\tau \le \alpha \le 1 - 1/\tau$ the “basic” intervals corresponding to these
fractions

$$\frac{a}{q} - \frac{1}{\tau} \le \alpha \le \frac{a}{q} + \frac{1}{\tau}; \tag{39}$$

for sufficiently large N these intervals, as may be proved,* will have no
points in common. In this manner, the segment
$-1/\tau \le \alpha \le 1 - 1/\tau$ can be decomposed into basic intervals and
“complementary” intervals. We represent I(N) as the sum of two terms

$$I(N) = I_1(N) + I_2(N), \tag{40}$$

where $I_1(N)$ denotes the sum of the integrals on the basic intervals and
$I_2(N)$ is the sum of the integrals on the complementary intervals. As will
be seen below, for unbounded growth of odd N we also have unbounded
growth of $I_1(N)$, with

$$\lim_{N \to \infty} \frac{I_2(N)}{I_1(N)} = 0. \tag{41}$$

So we see from (40) that the number of representations of an odd N as the
sum of three primes grows unboundedly with N, so that, in particular,
we have proved Goldbach’s conjecture for all sufficiently large odd N.
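The claim that the basic intervals do not overlap once h is small compared
with the cube root of N can be tested with exact rational arithmetic. The
sketch below lists the basic intervals for given N and h and checks that
consecutive ones are disjoint. The function names and the sample values of N
and h are assumptions of this illustration.

```python
from fractions import Fraction
from math import gcd


def basic_intervals(N, h):
    """Intervals [a/q - 1/tau, a/q + 1/tau] for irreducible a/q, 0 <= a < q <= h."""
    tau = Fraction(N, h)
    intervals = []
    for q in range(1, h + 1):
        for a in range(q):
            if gcd(a, q) == 1:  # gcd(0, 1) == 1, so the fraction 0/1 is included
                center = Fraction(a, q)
                intervals.append((center - 1 / tau, center + 1 / tau))
    return sorted(intervals)


def pairwise_disjoint(intervals):
    """True if no two of the sorted intervals share a point."""
    return all(left[1] < right[0]
               for left, right in zip(intervals, intervals[1:]))


if __name__ == "__main__":
    print(pairwise_disjoint(basic_intervals(10 ** 6, 10)))  # h^3 far below N
    print(pairwise_disjoint(basic_intervals(100, 10)))      # h^3 above N
```

With N = 10⁶ and h = 10 the centers are at least 1/90 apart while each
interval has half-length 10⁻⁵, so the intervals are disjoint; with N = 100
and h = 10 the half-length is 1/10 and they overlap.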

An expression for the integral on the basic intervals. Let $\alpha$ belong to
one of the basic intervals; from (39), $\alpha = a/q + z$, where $1 \le q \le h$ and
$|z| \le 1/\tau$. We break up the sum (36)

$$T_\alpha = \sum_{p \le N} e^{2\pi i \alpha p} = \sum_{p \le N} e^{2\pi i (a/q + z) p},$$

extended over all primes not exceeding N into partial sums $T_{\alpha, M}$ of the
form

$$T_{\alpha, M} = \sum_{M \le p < M'} e^{2\pi i (a/q + z) p},$$

where M' is so chosen that $e^{2\pi i z p}$ differs “little” from $e^{2\pi i z M}$; since
we intend to give only the idea of Vinogradov’s method, and not a proof of the
Goldbach-Vinogradov theorem, we will not state precisely what we mean
by the expression “differs little”; in his proof Vinogradov deals with
rigorously defined inequalities, involving a great deal of calculation. Thus

$$T_{\alpha, M} \approx e^{2\pi i z M} \sum_{M \le p < M'} e^{2\pi i (a/q) p} = e^{2\pi i z M}\, T_{a/q, M}, \tag{42}$$

where the symbol $\approx$ means that the first of the three expressions in the
last relation differs “little” from the second.

* If two such intervals surrounding the points $a_1/q_1$ and $a_2/q_2$ intersect,
then at a common point we will have the equation

$$\frac{a_1}{q_1} + \frac{\theta_1}{\tau} = \frac{a_2}{q_2} + \frac{\theta_2}{\tau}, \quad \text{where } |\theta_1| \le 1, \ |\theta_2| \le 1,$$

$$\frac{a_1 q_2 - a_2 q_1}{q_1 q_2} = \frac{\theta_2 - \theta_1}{\tau}.$$

But the absolute value of the left side of this last equation is not less than
$1/(q_1 q_2)$, i.e., is not less than $1/h^2$, and the right side is not greater
in absolute value than $2/\tau$, i.e., than $2h/N$. So if this last equation
were true, it would imply the inequality $1/h^2 \le 2h/N$, which contradicts
the choice of h.

We further break up each of the sums

$$T_{a/q,M} = \sum_{M \le p < M'} e^{2\pi i (a/q) p} \qquad (43)$$

into sums $T_{a/q,M,l}$, taken over all primes $p$ satisfying the relation $M \le p < M'$ and belonging to arithmetic progressions $qx + l$, where $l$ takes on all values from 0 to $q - 1$ which are relatively prime to $q$. But for $p = qx + l$

$$e^{2\pi i (a/q) p} = e^{2\pi i a x + 2\pi i (a/q) l} = e^{2\pi i (a/q) l},$$

and thus

$$T_{a/q,M,l} = e^{2\pi i (a/q) l} \cdot \pi(M, M', l), \qquad (44)$$

where $\pi(M, M', l)$ is the number of primes satisfying the conditions $M \le p < M'$ and belonging to the arithmetic progression $qx + l$. In the development of formula (14) for the number $\pi(x)$ of primes not exceeding $x$, it was established that $\pi(M, M', l)$, for values of $q$ which are “small” in comparison with the difference $M' - M$, differs little from

$$\frac{1}{\varphi(q)} \int_M^{M'} \frac{dx}{\ln x},$$

where $\varphi(q)$ is Euler's function. This is a number-theoretic function (i.e., a function defined for natural numbers $q$) representing the number of positive integers not exceeding $q$ and relatively prime to $q$. From (44) we may thus derive

$$T_{a/q,M,l} \approx e^{2\pi i (a/q) l} \cdot \frac{1}{\varphi(q)} \int_M^{M'} \frac{dx}{\ln x}. \qquad (45)$$
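The approximation behind (45) can be examined numerically. The following Python sketch counts the primes of an interval in each admissible residue class modulo $q$ and compares the counts with the integral of $dx/\ln x$ divided by Euler's function. The function names, the midpoint-rule integrator, and the sample values 10000, 20000, and $q = 7$ are assumptions chosen only for illustration; they do not come from the text.

```python
from math import gcd, log

def primes_below(n):
    """Sieve of Eratosthenes: all primes p < n."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n, i)))
    return [i for i in range(n) if sieve[i]]

def euler_phi(q):
    """Number of integers in 1..q relatively prime to q."""
    return sum(1 for l in range(1, q + 1) if gcd(l, q) == 1)

def integral_dx_over_ln(M, M2, steps=10000):
    """Midpoint rule for the integral of dx / ln x over [M, M2]."""
    h = (M2 - M) / steps
    return sum(h / log(M + (j + 0.5) * h) for j in range(steps))

def compare(M, M2, q):
    """Predicted count per progression and the actual counts pi(M, M2, l)."""
    primes = [p for p in primes_below(M2) if p >= M]
    predicted = integral_dx_over_ln(M, M2) / euler_phi(q)
    counts = {l: sum(1 for p in primes if p % q == l)
              for l in range(q) if gcd(l, q) == 1}
    return predicted, counts

predicted, counts = compare(10000, 20000, 7)
print(round(predicted, 1), counts)
```

For these sample values each of the six progressions receives roughly the same number of primes, close to the predicted value, which is the sense in which the two quantities differ “little.”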


In the expression on the right side of (45), only the first factor depends on $l$, i.e., on the choice of the arithmetic progression $qx + l$ (we now consider $q$ as fixed). After summing on $l$, we obtain

$$T_{a/q,M} \approx \frac{1}{\varphi(q)} \int_M^{M'} \frac{dx}{\ln x} \sum_l e^{2\pi i (a/q) l},$$

and further, from (42),

$$T_{\alpha,M} \approx e^{2\pi i z M} \cdot \frac{1}{\varphi(q)} \int_M^{M'} \frac{dx}{\ln x} \sum_l e^{2\pi i (a/q) l}, \qquad (46)$$

where

$$e^{2\pi i z M} \int_M^{M'} \frac{dx}{\ln x} \approx \int_M^{M'} \frac{e^{2\pi i z x}}{\ln x}\,dx,$$

which allows us to replace (46) by the relation

$$T_{\alpha,M} \approx \int_M^{M'} \frac{e^{2\pi i z x}}{\ln x}\,dx \cdot \frac{1}{\varphi(q)} \sum_l e^{2\pi i (a/q) l}. \qquad (47)$$

After summing on $M$ it is established that

$$T_\alpha \approx \int_2^N \frac{e^{2\pi i z x}}{\ln x}\,dx \cdot \frac{1}{\varphi(q)} \sum_l e^{2\pi i (a/q) l}. \qquad (48)$$

The sum

$$\sum_l e^{2\pi i (a/q) l}$$

occurring on the right side of (48), with the summation taken over natural numbers $l$ not exceeding $q$ and relatively prime to $q$, may be expressed as a number-theoretic function $\mu(q)$ defined in the following manner: $\mu(q) = 0$ if $q$ is divisible by the square of an integer greater than one; $\mu(1) = 1$ and $\mu(q) = (-1)^n$ if $q = p_1 p_2 \cdots p_n$, where $p_1, p_2, \ldots, p_n$ are distinct primes. Thus, for relatively prime $a$ and $q$

$$\sum_l e^{2\pi i (a/q) l} = \mu(q). \qquad (49)$$

Thus equation (48) may be written in the form

$$T_\alpha \approx \frac{\mu(q)}{\varphi(q)} \int_2^N \frac{e^{2\pi i z x}}{\ln x}\,dx.$$
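Both the identity (49) and the resulting approximation for $T_\alpha$ can be tested on a computer. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the case $z = 0$ is taken, and the integral of $dx/\ln x$ is replaced by the number of primes not exceeding $N$, so that the normalized sum should lie near $\mu(q)/\varphi(q)$.

```python
import cmath
from math import gcd, isqrt

def mobius(q):
    """Moebius function mu(q), computed by trial division."""
    result, d = 1, 2
    while d * d <= q:
        if q % d == 0:
            q //= d
            if q % d == 0:        # a square divides the original q
                return 0
            result = -result
        d += 1
    return -result if q > 1 else result

def euler_phi(q):
    return sum(1 for l in range(1, q + 1) if gcd(l, q) == 1)

def coprime_sum(a, q):
    """Left side of (49): sum of exp(2*pi*i*(a/q)*l) over l coprime to q."""
    return sum(cmath.exp(2j * cmath.pi * a * l / q)
               for l in range(1, q + 1) if gcd(l, q) == 1)

def primes_upto(n):
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def normalized_T(a, q, N):
    """T_{a/q} divided by the number of primes not exceeding N (case z = 0)."""
    primes = primes_upto(N)
    return sum(cmath.exp(2j * cmath.pi * a * p / q) for p in primes) / len(primes)

print(coprime_sum(1, 6), mobius(6))
print(normalized_T(1, 3, 100000), mobius(3) / euler_phi(3))
```

The first line checks (49) up to rounding error; the second shows how close the prime sum at $\alpha = 1/3$ comes to $\mu(3)/\varphi(3) = -1/2$.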


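From the fact that $\mu^3(q) = \mu(q)$ we have

$$T_\alpha^3 \approx \frac{\mu(q)}{\varphi^3(q)} \left( \int_2^N \frac{e^{2\pi i z x}}{\ln x}\,dx \right)^3, \qquad (50)$$

and from the definition of $I_1(N)$

$$I_1(N) = \sum_{1 \le q \le h} \sum_{0 \le a < q} \int_{-1/\tau}^{1/\tau} T_\alpha^3\, e^{-2\pi i \alpha N}\,dz, \qquad (51)$$

where for a given $q$ the summation is taken over all nonnegative $a$ less than $q$ and relatively prime to $q$, as required in (49). Since $\alpha = a/q + z$, we then have, as a result of (50),

$$I_1(N) \approx \sum_{1 \le q \le h} \frac{\mu(q)}{\varphi^3(q)} \sum_{0 \le a < q} e^{-2\pi i (a/q) N} \int_{-1/\tau}^{1/\tau} \left( \int_2^N \frac{e^{2\pi i z x}}{\ln x}\,dx \right)^3 e^{-2\pi i z N}\,dz. \qquad (52)$$

We introduce the notation

$$R(N) = \int_{-1/\tau}^{1/\tau} \left( \int_2^N \frac{e^{2\pi i z x}}{\ln x}\,dx \right)^3 e^{-2\pi i z N}\,dz. \qquad (53)$$

From relation (52) it follows that

$$I_1(N) \approx R(N) \sum_{1 \le q \le h} \frac{\mu(q)}{\varphi^3(q)} \sum_{0 \le a < q} e^{-2\pi i (a/q) N}. \qquad (54)$$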


Here we must draw attention to the fact that $R(N)$ is an analytic expression, which can therefore be calculated approximately; in fact, it turns out that

$$R(N) \approx \frac{N^2}{2(\ln N)^3}. \qquad (55)$$

The expression occurring as a factor of $R(N)$ on the right side of (54) differs “little” from the sum of the infinite series

$$S(N) = \sum_{q=1}^{\infty} \frac{\mu(q)}{\varphi^3(q)} \sum_{0 \le a < q} e^{-2\pi i (a/q) N}, \qquad (56)$$

so that, from (54) and (55), it can be established that

$$I_1(N) \approx \frac{N^2}{2(\ln N)^3}\, S(N), \qquad (57)$$

or, more precisely,

$$I_1(N) = \frac{N^2}{2(\ln N)^3}\, [S(N) + \gamma_1(N)], \qquad (58)$$

where

$$\lim_{N \to \infty} \gamma_1(N) = 0. \qquad (59)$$

We note that the number-theoretic expression $S(N)$ may be written in the form

$$S(N) = C \prod_{p \mid N} \left( 1 - \frac{1}{p^2 - 3p + 3} \right), \qquad (60)$$

where $C$ is a constant, the multiplication is extended over all prime divisors of the number $N$, and, as the computations show,

$$S(N) > 0.6. \qquad (61)$$

Estimate of the integral on the complementary intervals. We turn now to an estimate of the sum $I_2$ of the integrals on the complementary intervals. Since the modulus of the integral does not exceed the integral of the modulus of the function being integrated, and since $|e^{-2\pi i \alpha N}| = 1$ for real $\alpha N$, we have

$$|I_2| \le \max |T_\alpha| \cdot \int_{-1/\tau}^{1 - 1/\tau} |T_\alpha|^2 \, d\alpha, \qquad (62)$$

where $\max |T_\alpha|$ represents the largest value of $|T_\alpha|$ for $\alpha$ belonging to the complementary intervals (we have strengthened the inequality by taking as the factor of $\max |T_\alpha|$ the integral extended over the whole interval $-1/\tau \le \alpha \le 1 - 1/\tau$).

But the square of the modulus of a complex number is equal to the product of the number with its complex conjugate, so that

$$|T_\alpha|^2 = T_\alpha \cdot \overline{T}_\alpha,$$

where from (36) we have

$$\overline{T}_\alpha = \sum_{p \le N} e^{-2\pi i \alpha p},$$

since $e^{-2\pi i \alpha p} = \cos 2\pi \alpha p - i \sin 2\pi \alpha p$. Thus, inequality (62) may be rewritten in the form

$$|I_2| \le \max |T_\alpha| \cdot \int_{-1/\tau}^{1 - 1/\tau} \sum_{p \le N} e^{2\pi i \alpha p} \sum_{p_1 \le N} e^{-2\pi i \alpha p_1} \, d\alpha$$

or in the form

$$|I_2| \le \max |T_\alpha| \cdot \sum_{p \le N} \sum_{p_1 \le N} \int_{-1/\tau}^{1 - 1/\tau} e^{2\pi i \alpha (p - p_1)} \, d\alpha. \qquad (63)$$

But the double sum of integrals in inequality (63), from what was said at the beginning of the present section, represents the number $U$ of solutions in primes $p$, $p_1$, not exceeding $N$, of the equation $p - p_1 = 0$, or simply the number of primes not exceeding $N$, i.e., $\pi(N)$. From the result (12) of Čebyšev we have

$$\pi(N) < B \cdot \frac{N}{\ln N},$$

where $B$ is a constant. In this manner

$$|I_2| < B \cdot \frac{N}{\ln N} \cdot \max |T_\alpha|, \qquad (64)$$

where, to repeat, $\max |T_\alpha|$ represents the largest value of $|T_\alpha|$ on the complementary intervals. From (58) and (59) it follows that in order to complete the proof of the Goldbach-Vinogradov theorem, we must now show that $\max |T_\alpha|$ has order less than $N/(\ln N)^2$; however, the establishment of this fact presents the greatest difficulty and constitutes the essential part of the whole proof of the theorem.

Every $\alpha$ belonging to a complementary interval can be represented in the form $\alpha = a/q + z$, where $h < q \le \tau$ and $|z| \le 1/(q\tau)$. The problem thus consists of estimating the modulus of the trigonometric sum

$$T_\alpha = \sum_{p \le N} e^{2\pi i (a/q + z) p}$$

under the given conditions. Vinogradov established, in particular, that

$$\lim_{N \to \infty} \frac{\max |T_\alpha|}{N/(\ln N)^3} = 0; \qquad (65)$$

here he made use of a very important identity which he discovered for the function $\mu(n)$ discussed previously.

Unfortunately, it is not possible here to give a proof of equation (65); the interested reader is referred to Chapter X in Vinogradov's book “Methods of trigonometric sums in the theory of numbers.”

From (65) and (64), as we noted, it follows that

$$\lim_{N \to \infty} \frac{I_2(N)}{N^2/(\ln N)^3} = 0.$$

In this manner, from (40), (58), and (59) we have

$$I(N) = \frac{N^2}{2(\ln N)^3}\, [S(N) + \gamma(N)], \qquad (66)$$

where

$$\lim_{N \to \infty} \gamma(N) = 0,$$

and $S(N)$ has the value (60), so that, from (61), $S(N) > 0.6$. This completes the proof of the theorem.

§5. Decomposition of Integers into the Sum of Two Squares;
Complex Integers

The study of prime numbers is important chiefly because of the central role they play in most of the laws of number theory: It frequently happens that questions which at first sight seem far removed from divisibility are nevertheless shown by more careful consideration to be intimately connected with the theory of prime numbers. We illustrate this statement by the following example.

One of the problems of number theory consists of finding those natural numbers that can be decomposed into the sum of the squares of two integers (not necessarily different from zero).

The rule for the sequence of numbers that are the sum of two squares is not immediately clear. From 1 to 50, for example, it consists of the numbers 1, 2, 4, 5, 8, 9, 10, 13, 16, 17, 18, 20, 25, 26, 29, 32, 34, 36, 37, 40, 41, 45, 49, 50, a sequence which seems quite erratic. The 17th century French mathematician Fermat noticed that here everything depends on how the number can be represented as the product of primes, i.e., the question is inherently related to the theory of prime numbers.
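The sequence is easy to generate by direct search. The following Python sketch (the function name is an assumption made for illustration) tests each number up to 50 by trying every possible first square.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """True if n = a*a + b*b for some integers a, b >= 0 (zero allowed)."""
    for a in range(isqrt(n) + 1):
        rest = n - a * a
        if isqrt(rest) ** 2 == rest:
            return True
    return False

print([n for n in range(1, 51) if is_sum_of_two_squares(n)])
```

A brute-force search of this kind reveals the sequence but not the rule behind it; the rule is what the three headings below establish.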

Prime numbers, other than p = 2, are odd, so that division by 4 gives a remainder equal to 1 (for a prime number of the form 4n + 1) or to 3 (for a prime number of the form 4n + 3).

We will consider the question of expressing a given number as the sum of two squares under the following three headings.

1. An odd prime number p is the sum of two squares if and only if p = 4n + 1.

The proof of the fact that a number of the form 4n + 3 cannot be expressed as the sum of two squares is almost obvious: The sum of the squares of two even numbers is divisible by 4, the sum of the squares of two odd numbers gives a remainder of 2 when divided by 4, and the sum of the squares of an even and an odd number, when divided by 4, gives a remainder of 1.
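Since the remainder of a square on division by 4 depends only on the remainder of the number itself, the three cases can be confirmed by a finite check. A minimal Python sketch, with the variable name chosen only for illustration:

```python
# Remainders of a*a + b*b on division by 4; it suffices to let a and b
# run over the four possible remainders 0, 1, 2, 3.
residues = sorted({(a * a + b * b) % 4 for a in range(4) for b in range(4)})
print(residues)  # [0, 1, 2]
```

The remainder 3 never appears, in agreement with the argument above.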

Let us now prove a preliminary theorem, namely that if $p$ is a prime, then $(p - 1)! + 1$ is divisible by $p$. The numbers not divisible by $p$, when divided by $p$, give the remainders $1, 2, 3, \ldots, p - 1$. We choose an integer $r$, $1 \le r \le p - 1$, and multiply $r$ by $1, 2, \ldots, p - 1$; when we divide the products so constructed by $p$ we obtain, as is not difficult to prove, all these same remainders, but in general in a different order. In particular, among these remainders will be the number 1, that is to say, for every $r$ one can find an $r_1$ such that $r \cdot r_1 = 1 + kp$. We note that $r = r_1$ only if $r = 1$ or $r = p - 1$. For if $r^2 = 1 + kp$, then $(r + 1)(r - 1)$ is divisible by $p$; but for numbers $1 \le r \le p - 1$ this is possible only for $r = 1$ and $r = p - 1$. Let us find the remainder on dividing $(p - 1)! = 1 \cdot 2 \cdots (p - 1)$ by $p$. In this product, for every factor $r$, except 1 and $p - 1$, there occurs a corresponding $r_1$, distinct from $r$, such that $r \cdot r_1$ gives the remainder 1. Thus $(p - 1)!$ will give a remainder on division by $p$ which is the same as if only the two factors 1 and $p - 1$ were present, i.e., it gives the remainder $p - 1$. Thus, $(p - 1)! + 1$ is divisible by $p$.
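The pairing used in this proof can be made explicit for a small prime. The Python sketch below is only an illustration; the function names are assumptions. It finds for each factor $r$ the partner $r_1$ and checks the divisibility of $(p - 1)! + 1$ directly.

```python
from math import factorial

def wilson_holds(n):
    """True if (n - 1)! + 1 is divisible by n."""
    return (factorial(n - 1) + 1) % n == 0

def inverse_pairs(p):
    """For each r in 2..p-2, the r1 in 1..p-1 with r * r1 = 1 + k*p."""
    return {r: next(r1 for r1 in range(1, p) if r * r1 % p == 1)
            for r in range(2, p - 1)}

print(inverse_pairs(11))   # every r is paired with a different r1
print(wilson_holds(11))    # True
```

For $p = 11$ the factors 2, 3, ..., 9 fall into four pairs, each with product leaving remainder 1, so that only the factors 1 and 10 affect the remainder of $10!$.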

Now let $p = 4n + 1$. Further, we write

$$(p - 1)! + 1 = \left\{ 1 \cdot 2 \cdots \frac{p - 1}{2} \right\} \left\{ \frac{p + 1}{2} \cdots (p - 2)(p - 1) \right\} + 1.$$

The second expression in braces, when divided by $p$, will leave the same remainder as $(-1)^{(p-1)/2}\, [(p - 1)/2]!$. But $(p - 1)/2 = 2n$ is an even number, so that in this case $\{[(p - 1)/2]!\}^2 + 1$ is also divisible by $p$. We denote by $A$ the remainder on dividing $[(p - 1)/2]!$ by $p$. It is obvious that $A^2 + 1$ is also divisible by $p$.

We consider the expression $x - Ay$, in which $x$ and $y$ range independently over the numbers $0, 1, \ldots, [\sqrt{p}]$ (here $[x]$ denotes the largest integer not exceeding $x$). We thus obtain $([\sqrt{p}] + 1)^2 \ge p + 1$ numerical values for $x - Ay$, which may be distinct or may in some cases coincide. Since the various remainders on dividing by $p$ can only be $p$ in number $(0, 1, 2, \ldots, p - 1)$, while we here have at least $p + 1$ values for $x - Ay$, there must exist two distinct pairs $(x_1, y_1)$ and $(x_2, y_2)$ such that $x_1 - Ay_1$ and $x_2 - Ay_2$ leave the same remainder on dividing by $p$; i.e., $(x_1 - x_2) - A(y_1 - y_2)$ is divisible by $p$. We set $x_0 = x_1 - x_2$, $y_0 = y_1 - y_2$. Obviously, $|x_0| < \sqrt{p}$, $|y_0| < \sqrt{p}$. Since $A^2 + 1$ is divisible by $p$, it follows that $y_0^2 (A^2 + 1) = (Ay_0)^2 + y_0^2$ is divisible by $p$; but since $x_0 - Ay_0$ is divisible by $p$, the number $x_0^2 - (Ay_0)^2 = (x_0 - Ay_0)(x_0 + Ay_0)$ is divisible by $p$. Thus the quantity $x_0^2 + y_0^2$, which is equal to $(x_0^2 - (Ay_0)^2) + ((Ay_0)^2 + y_0^2)$, is divisible by $p$. But $|x_0| < \sqrt{p}$, $|y_0| < \sqrt{p}$. Hence $x_0^2 + y_0^2 = 0$ or $x_0^2 + y_0^2 = p$. The first is impossible, since the pairs $(x_1, y_1)$ and $(x_2, y_2)$ were distinct. Thus a prime number of the form $4n + 1$ is representable as the sum of two squares.
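The proof is constructive, and its steps can be followed literally on a computer. The Python sketch below is an illustration under the assumption that the argument is a prime of the form $4n + 1$; the function name is hypothetical. It computes $A$, runs through the values $x - Ay$, and stops at the first coincidence of remainders, which the counting argument guarantees.

```python
from math import factorial, isqrt

def two_squares(p):
    """For a prime p = 4n + 1, return (x0, y0) with x0**2 + y0**2 == p."""
    A = factorial((p - 1) // 2) % p          # remainder of [(p - 1)/2]!
    assert (A * A + 1) % p == 0              # holds for primes p = 4n + 1
    seen = {}                                # remainder of x - A*y -> (x, y)
    s = isqrt(p)
    for x in range(s + 1):
        for y in range(s + 1):
            r = (x - A * y) % p
            if r in seen:                    # two distinct pairs, same remainder
                x1, y1 = seen[r]
                return abs(x - x1), abs(y - y1)
            seen[r] = (x, y)
    raise ValueError("no coincidence found; is p a prime of the form 4n + 1?")

for p in (5, 13, 17, 29, 97):
    print(p, two_squares(p))
```

Computing the factorial is wasteful for large $p$; the sketch follows the proof rather than aiming at efficiency.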

2. We turn to the decomposition of an arbitrary integer into the sum of two squares. It is easy to establish the identity

$$(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2.$$

This identity shows that the product of two integers that are the sum of two squares is again the sum of two squares. Hence the product of any powers of prime numbers of the form 4n + 1 (or which are equal to 2) is the sum of two squares. Since multiplying the sum of two squares by a square gives the sum of two squares, any number in which the prime factors of the form 4n + 3 occur in even powers is the sum of two squares.
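The identity also tells us how to write the product explicitly. In the Python sketch below, the function name and the sample numbers are assumptions made for illustration; two representations are combined into a representation of the product.

```python
def compose(ab, cd):
    """Given n1 = a*a + b*b and n2 = c*c + d*d, represent n1 * n2."""
    (a, b), (c, d) = ab, cd
    return abs(a * c - b * d), a * d + b * c

# 5 = 1 + 4 and 13 = 4 + 9, so 65 = 5 * 13 must be a sum of two squares.
x, y = compose((1, 2), (2, 3))
print(x, y, x * x + y * y)  # 4 7 65
```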

3. We now show that if a prime number of the form 4n + 3 enters into a number in an odd power, the number cannot be expressed as the sum of two squares. The original question will then be completely settled.

We will consider complex numbers of the form $a + bi$, where $a$ and $b$ are ordinary integers. Such a complex number will be called a complex integer. If an integer $N$ is the sum of two squares, $N = a^2 + b^2$, then $N = (a + bi)(a - bi) = \alpha \cdot \bar{\alpha}$ (where $\alpha = a + bi$ and $\bar{\alpha}$ denotes the complex conjugate of the number $\alpha$), i.e., $N$ is factored in the domain of complex integers into complex conjugate factors.

In this domain of complex integers, we may construct a theory of divisibility completely analogous to the theory of divisibility in the domain of ordinary integers. We will say that the complex integer $\alpha$ is divisible by the complex integer $\beta$, if $\alpha/\beta$ is again a complex integer. There exist only four complex integers which divide 1, namely $1$, $-1$, $i$, and $-i$. We will say that a complex integer $\alpha$ is a prime, if it does not have any divisors other than $1$, $-1$, $i$, $-i$, $\alpha$, $-\alpha$, $\alpha i$, $-\alpha i$. But now the problem solved under the first heading above will have a different meaning; it will now turn out that numbers of the form 4n + 1 (or equal to 2) which in the previous case were prime will cease to be prime in the domain of complex integers, while it is easy to prove that primes of the form 4n + 3 remain prime.

For, if $p = \alpha\beta$, then $p = \bar{\alpha}\bar{\beta}$ and $p^2 = \alpha\bar{\alpha}\beta\bar{\beta}$. But $\alpha\bar{\alpha}$ and $\beta\bar{\beta}$ are ordinary positive integers whose product is $p^2$; and $p \ne \alpha\bar{\alpha}$, since prime numbers of the form 4n + 3 are not the sum of two squares. This means that $\alpha\bar{\alpha} = 1$ or $\beta\bar{\beta} = 1$; thus one of the factors $\alpha$, $\beta$ can be only $\pm 1$ or $\pm i$, so that $p$ has no divisors other than the obvious ones.

For complex integers the theorem on the unique decomposition into prime factors still holds. Uniqueness here means, of course, that the order of multiplication is ignored and also all factors of the form $1$, $-1$, $i$, $-i$.

Let $N$ be the sum of two squares, $N = \alpha\bar{\alpha}$. Let $p$ be a prime number of the form 4n + 3. Let us calculate what power of $p$ appears in the number $N$. From the fact that $p$ remains a prime in the complex domain, it is sufficient to calculate what power of $p$ appears in $\alpha$ and in $\bar{\alpha}$. But these powers are equal, so that $p$ necessarily appears in $N$ to an even power, which proves the proposition.
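Taken together, the three headings give a complete criterion: a natural number is the sum of two squares if and only if every prime of the form 4n + 3 enters into it in an even power. The Python sketch below compares this criterion with a direct search; the function names and the range tested are assumptions chosen for illustration.

```python
from math import isqrt

def criterion(n):
    """True if every prime factor of the form 4k + 3 occurs to an even power."""
    d = 2
    while d * d <= n:
        e = 0
        while n % d == 0:       # only prime d can divide the reduced n
            n //= d
            e += 1
        if d % 4 == 3 and e % 2 == 1:
            return False
        d += 1
    # What is left is 1 or a single prime factor occurring to the first power.
    return n % 4 != 3

def brute_force(n):
    """True if n = a*a + b*b for some integers a, b >= 0."""
    return any(isqrt(n - a * a) ** 2 == n - a * a for a in range(isqrt(n) + 1))

print(all(criterion(n) == brute_force(n) for n in range(1, 2000)))
```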

The discovery that a rich theory of divisibility is possible elsewhere than in the domain of whole rational numbers greatly extended the field of vision of 19th century mathematicians. The development of these ideas called for the creation of new general concepts in mathematics, such as, for example, rings and ideals. The significance of these concepts at the present time has far outgrown the frame of number theory.

XI


THE THEORY 
OF PROBABILITY 


§1. The Laws of Probability 

The simplest laws of natural science are those that state the conditions under which some event of interest to us will either certainly occur or certainly not occur; i.e., these conditions may be expressed in one of the following two forms:

1. If a complex (i.e., a set or collection) of conditions S is realized, then event A certainly occurs;

2. If a complex of conditions S is realized, then event A cannot occur.

In the first case the event A, with respect to the complex of conditions S, is called a “certain” or “necessary” event, and in the second an “impossible” event. For example, under atmospheric pressure and at temperature t between 0° and 100° (the complex of conditions S) water necessarily occurs in the liquid state (the event $A_1$ is certain) and cannot occur in a gaseous or solid state (events $A_2$ and $A_3$ are impossible).

An event A, which under a complex of conditions S sometimes occurs and sometimes does not occur, is called random with respect to the complex of conditions. This raises the question: Does the randomness of the event A demonstrate the absence of any law connecting the complex of conditions S and the event A? For example, let it be established that lamps of a specific type, manufactured in a certain factory (condition S) sometimes continue to burn more than 2,000 hours (event A), but sometimes burn out and become useless before the expiration of that time. May it not still be possible that the results of experiments to see whether a given lamp will or will not burn for 2,000 hours will serve to evaluate the production of the factory? Or should we restrict ourselves to indicating only the period (say 500 hours) for which in practice all lamps work without fail, and the period (say 10,000 hours) after which in practice all lamps do not work? It is clear that to describe the working life of a lamp by an inequality of the form 500 ≤ T ≤ 10,000 is of little help to the consumer. He will receive much more valuable information if we tell him that in approximately 80% of the cases the lamps work for no less than 2,000 hours. A still more complete evaluation of the quality of the lamps will consist of showing for any T the percent $\nu(T)$ of the lamps which work for no less than T hours, say in the form of the graph in figure 1.

Fig. 1.

The curve $\nu(T)$ is found in practice by testing with a sufficiently large sample (100-200) of the lamps. Of course, the curve found in such a manner is of real value only in those cases where it truly represents an actual law governing not only the given sample but all the lamps manufactured with a given quality of material and under given technological conditions; that is, only if the same experiments conducted with another sample will give approximately the same results (i.e., the new curve $\nu(T)$ will differ little from the curve derived from the first sample). In other words, the statistical law expressed by the curves $\nu(T)$ for the various samples is only a reflection of the law of probability connecting the useful life of a lamp with the materials and the technological conditions of its manufacture.

This law of probability is given by a function P(T), where P(T) is the probability that a single lamp (made under the given conditions) will burn no less than T hours.

The assertion that the event A occurs under conditions S with a definite probability

$$P(A/S) = p$$

amounts to saying that in a sufficiently long series of tests (i.e., realizations of the complex of conditions S) the frequencies

$$\nu_r = \frac{\mu_r}{n_r}$$

of the occurrence of the event A (where $n_r$ is the number of tests in the $r$th series, and $\mu_r$ is the number of tests of this series for which event A occurs) will be approximately identical with one another and will be close to p.

The assumption of the existence of a constant p = P(A/S) (objectively determined by the connection between the complex of conditions S and the event A) such that the frequencies $\nu$ get closer “generally speaking” to p as the number of tests increases, is well borne out in practice for a wide class of events. Events of this kind are usually called random or stochastic.
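The behavior of the frequencies can be imitated with a pseudorandom number generator. The Python sketch below is only an illustration: the probability 0.8 (suggested by the lamp example), the series lengths, and the function name are assumptions, and a simulated experiment is of course no substitute for testing real lamps.

```python
import random

def frequencies(p, sizes, seed=0):
    """Frequency of an event of probability p in series of the given lengths."""
    rng = random.Random(seed)              # fixed seed: reproducible runs
    return [sum(rng.random() < p for _ in range(n)) / n for n in sizes]

for n, nu in zip((100, 10000, 100000), frequencies(0.8, (100, 10000, 100000))):
    print(n, nu)
```

In runs of this kind the frequencies for the longer series lie noticeably closer to 0.8 than those for the short ones, though only “generally speaking.”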

This example belongs to the laws of probability for mass production. The reality of such laws cannot be doubted, and they form the basis of important practical applications in statistical quality control. Of a similar kind are the laws of probability for the scattering of missiles, which are basic in the theory of gunfire. Since this is historically one of the earliest applications of the theory of probability to technical problems, we will return below to some simple problems in the theory of gunfire.

What was said above about the “closeness” of the frequency $\nu$ to the probability p for a large number n of tests is somewhat vague; we said nothing about how small the difference $\nu - p$ may be for any n. The degree of closeness of $\nu$ to p is estimated in §3. It is interesting to note that a certain indefiniteness in this question is quite unavoidable. The very statement itself that $\nu$ and p are close to each other has only a probabilistic character, as becomes clear if we try to make the whole situation precise.


§2. The Axioms and Basic Formulas of the Elementary Theory
of Probability

Since it cannot be doubted that statistical laws are of great importance, we turn to the question of methods of studying them. First of all one thinks of the possibility of proceeding in a purely empirical way. Since a law of probability exhibits itself only in mass processes, it is natural to imagine that in order to discover the law we must conduct a mass experiment.

Such an idea, however, is only partly right. As soon as we have established certain laws of probability by experiment, we may proceed to deduce from them new laws of probability by logical means or by computation, under certain general assumptions. Before showing how this is done, we must enumerate certain basic definitions and formulas of the theory of probability.

From the representation of probability as the standard value of the frequency $\nu = m/n$, where $0 \le m \le n$, and thus $0 \le \nu \le 1$, it follows that the probability P(A) of any event A must be assumed to lie between zero and one*

$$0 \le P(A) \le 1. \qquad (1)$$

Two events are said to be mutually exclusive if they cannot both occur (under the complex of conditions S). For example, in throwing a die, the occurrence of an even number of spots and of a three are mutually exclusive. An event A is called the union of events $A_1$ and $A_2$ if it consists of the occurrence of at least one of the events $A_1$, $A_2$. For example, in throwing a die, the event A, consisting of rolling 1, 2, or 3, is the union of the events $A_1$ and $A_2$, where $A_1$ consists of rolling 1 or 2 and $A_2$ consists of rolling 2 or 3. It is easy to see that for the number of occurrences $m_1$, $m_2$, and $m$ of two mutually exclusive events $A_1$ and $A_2$ and their union $A = A_1 \cup A_2$, we have the equation $m = m_1 + m_2$, or for the corresponding frequencies $\nu = \nu_1 + \nu_2$.

This leads naturally to the following axiom for the addition of probabilities:

$$P(A_1 \cup A_2) = P(A_1) + P(A_2), \qquad (2)$$

if the events $A_1$ and $A_2$ are mutually exclusive and $A_1 \cup A_2$ denotes their union.

Further, for an event U which is certain, we naturally take

$$P(U) = 1. \qquad (3)$$

The whole mathematical theory of probability is constructed on the basis of simple axioms of the type (1), (2), and (3). From the point of view of pure mathematics, probability is a numerical function of “events,” with a number of properties determined by axioms. The properties of probability, expressed by formulas (1), (2), and (3), serve as a sufficient basis for the construction of what is called the elementary theory of probability, if we do not insist on including in the axiomatization the concepts of an event itself, the union of events, and their intersection, as defined later. For the beginner it is more useful to confine himself to an intuitive understanding of the terms “event” and “probability,” but to realize that although the meaning of these terms in practical life cannot be completely formalized, still this fact does not affect the complete formal precision of an axiomatized, purely mathematical presentation of the theory of probability.

The union of any given number of events $A_1, A_2, \ldots, A_s$ is defined as the event A consisting of the occurrence of at least one of these events.


* For brevity we now change P(A/S) to P(A).

From the axiom of addition, we easily obtain for any number of pairwise mutually exclusive events $A_1, A_2, \ldots, A_s$ and their union A,

$$P(A) = P(A_1) + P(A_2) + \cdots + P(A_s)$$

(the so-called theorem of the addition of probabilities).

If the union of these events is an event that is certain (i.e., under the complex of conditions S one of the events $A_k$ must occur), then

$$P(A_1) + P(A_2) + \cdots + P(A_s) = 1.$$

In this case the system of events $A_1, \ldots, A_s$ is called a complete system of events.

We now consider two events A and B, which, generally speaking, are not mutually exclusive. The event C is the intersection of the events A and B, written C = AB, if the event C consists of the occurrence of both A and B.*

For example, if the event A consists of obtaining an even number in the throw of a die and B consists of obtaining a multiple of three, then the event C consists of obtaining a six.

In a large number n of repeated trials, let the event A occur m times and the event B occur l times, in k of which B occurs together with the event A. The quotient k/m is called the conditional frequency of the event B under the condition A. The frequencies k/m, m/n, and k/n are connected by the formula

$$\frac{k}{m} = \frac{k}{n} : \frac{m}{n},$$

which naturally gives rise to the following definition:

The conditional probability P(B/A) of the event B under the condition A is the quotient

$$P(B/A) = \frac{P(AB)}{P(A)}.$$

Here it is assumed, of course, that $P(A) \ne 0$.

If the events A and B are in no way essentially connected with each other, then it is natural to assume that event B will not appear more often, or less often, when A has occurred than when A has not occurred, i.e., that approximately $k/m \approx l/n$ or

$$\frac{k}{n} = \frac{k}{m} \cdot \frac{m}{n} \approx \frac{l}{n} \cdot \frac{m}{n}.$$

In this last approximate equation $m/n = \nu_A$ is the frequency of the event A, and $l/n = \nu_B$ is the frequency of the event B, and finally $k/n = \nu_{AB}$ is the frequency of the intersection of the events A and B.

We see that these frequencies are connected by the relation

$$\nu_{AB} \approx \nu_A \nu_B.$$

For the probabilities of the events A, B, and AB, it is therefore natural to accept the corresponding exact equation

$$P(AB) = P(A) \cdot P(B). \qquad (4)$$

Equation (4) serves to define the independence of two events A and B.

* Similarly, the intersection C of any number of events $A_1, A_2, \ldots, A_s$ consists of the occurrence of all the given events.

Similarly, we may define the independence of any number of events. Also, we may give a definition of the independence of any number of experiments, which means, roughly speaking, that the outcome of any part of the experiments does not depend on the outcome of the rest.*
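The definitions of conditional probability and of independence can be tried out on the throw of a die, with A the event of obtaining an even number and B the event of obtaining a multiple of three. The Python sketch below assumes that the six faces are equally probable; the names used are hypothetical.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)                     # faces of the die, equally probable

def prob(event):
    """Probability of an event given as a predicate on the outcome."""
    return Fraction(sum(1 for w in OUTCOMES if event(w)), len(OUTCOMES))

def A(w):
    return w % 2 == 0                      # an even number

def B(w):
    return w % 3 == 0                      # a multiple of three

def AB(w):
    return A(w) and B(w)                   # the intersection: a six

p_B_given_A = prob(AB) / prob(A)           # conditional probability P(B/A)
print(prob(A), prob(B), prob(AB), p_B_given_A)   # 1/2 1/3 1/6 1/3
print(prob(AB) == prob(A) * prob(B))             # True
```

Here P(B/A) coincides with P(B), so that in this example the events A and B satisfy equation (4) and are independent.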

We now compute the probability $P_k$ of precisely k occurrences of a certain event A in n independent tests, in each one of which the probability p of the occurrence of this event is the same. We denote by $\bar{A}$ the event that event A does not occur. It is obvious that

$$P(\bar{A}) = 1 - P(A) = 1 - p.$$

From the definition of the independence of experiments it is easy to see that the probability of any specific sequence consisting of k occurrences of A and n - k nonoccurrences of A is equal to

$$p^k (1 - p)^{n-k}. \qquad (5)$$

Thus, for example, for n = 5 and k = 2 the probability of getting the sequence $A\bar{A}A\bar{A}\bar{A}$ will be $p(1 - p)p(1 - p)(1 - p) = p^2(1 - p)^3$.

By the theorem on the addition of probabilities, $P_k$ will be equal to the sum of the probabilities of all sequences with k occurrences and n - k nonoccurrences of the event A, i.e., $P_k$ will be equal from (5) to the product of the number of such sequences by $p^k(1 - p)^{n-k}$. The number of such sequences is obviously equal to the number of combinations of n things taken k at a time, since the k positive outcomes may occupy any k places in the sequence of n trials.

Finally we get

$$P_k = C_n^k\, p^k (1 - p)^{n-k} \quad (k = 0, 1, 2, \ldots, n) \qquad (6)$$

(which is called a binomial distribution).

* A more exact meaning of independent experiments is the following. We divide the n experiments in any way into two groups and let the event A consist of the result that all the experiments of the first group have certain preassigned outcomes, and the event B that the experiments of the second group have preassigned outcomes. The experiments are called independent (as a collection) if for arbitrary decomposition into two groups and arbitrarily preassigned outcomes the events A and B are independent in the sense of (4).

We will return in §4 to a consideration of the objective meaning in the actual world of the independence of events.
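Formula (6) translates directly into a short program. In the Python sketch below the function name is an assumption; exact fractions are used so that the probabilities can be seen to add up to one, as they must for a complete system of events.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k occurrences in n independent tests, formula (6)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p = Fraction(1, 2)
print([binomial_pmf(k, 5, p) for k in range(6)])
print(sum(binomial_pmf(k, 5, p) for k in range(6)))   # 1
```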

In order to see how the definitions and formulas are applied, we consider an example that arises in the theory of gunfire.

Let five hits be sufficient for the destruction of the target. What interests us is the question whether we have the right to assume that 40 shots will insure the necessary five hits. A purely empirical solution of the problem would proceed as follows. For given dimensions of the target and for a given range, we carry out a large number (say 200) of firings, each consisting of 40 shots, and we determine how many of these firings produce at least five hits. If this result is achieved, for example, by 195 firings out of the 200, then the probability P is approximately equal to

$$P \approx \frac{195}{200} = 0.975.$$

If we proceed in this purely empirical way, we will use up 8,000 shells to solve a simple special problem. In practice, of course, no one proceeds in such a way. Instead, we begin the investigation by assuming that the scattering of the shells for a given range is independent of the size of the target. It turns out that the longitudinal and lateral deviations, from the mean point of landing of the shells, follow a law with respect to the frequency of deviations of various sizes that is illustrated in figure 2.

    Deviation:  -4B to -3B   -3B to -2B   -2B to -B   -B to 0   0 to B   B to 2B   2B to 3B   3B to 4B
    Frequency:      2%           7%          16%        25%      25%       16%        7%         2%

Fig. 2.

The letter B here denotes what is called the probable deviation. The probable deviation, generally speaking, is different for longitudinal and for lateral deviations and increases with increasing range. The probable deviations for different ranges for each type of gun and of shell are found empirically in firing practice on an artillery range. But the subsequent solution of all possible special problems of the kind described is carried out by calculations only.

For simplicity, we assume that the target has the form of a rectangle, one side of which is directed along the line of fire and has a length of two probable longitudinal deviations, while the other side is perpendicular to the line of fire and is equal in length to two probable lateral deviations. We assume further that the range has already been well established, so that the mean trajectory of the shells passes through its center (figure 3).

We also assume that the lateral and longitudinal deviations are independent.* Then for a given shell to fall on the target, it is necessary and sufficient that its longitudinal and lateral deviations do not exceed the corresponding probable deviations. From figure 2 each of these events will be observed for about 50% of the shells fired, i.e., with probability 1/2. The intersection of the two events will occur for about 25% of the shells fired; i.e., the probability that a specific shell will hit the target will be equal to

$$p = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4},$$

and the probability of a miss for a single shell will be

$$1 - p = \frac{3}{4}.$$

Assuming that hits by the individual shells represent independent events, and applying the binomial formula (6), we find that the probability for getting exactly k hits in 40 shots will be

$$P_k = C_{40}^k\, p^k (1 - p)^{40-k} = \frac{40 \cdot 39 \cdots (41 - k)}{1 \cdot 2 \cdots k} \left( \frac{1}{4} \right)^k \left( \frac{3}{4} \right)^{40-k}.$$

What concerns us is the probability of getting no less than five hits, and this is now expressed by the formula

$$P = \sum_{k=5}^{40} P_k.$$

* This assumption of independence is borne out by experience.

But it is simpler to compute this probability from the formula P = 1 - Q, where

$$Q = \sum_{k=0}^{4} P_k$$

is the probability of getting less than five hits.

We may calculate that

$$P_0 = \left( \frac{3}{4} \right)^{40} \approx 0.00001,$$

$$P_1 = 40 \left( \frac{3}{4} \right)^{39} \frac{1}{4} \approx 0.00013,$$

$$P_2 = \frac{40 \cdot 39}{2} \left( \frac{3}{4} \right)^{38} \left( \frac{1}{4} \right)^2 \approx 0.00087,$$

$$P_3 = \frac{40 \cdot 39 \cdot 38}{2 \cdot 3} \left( \frac{3}{4} \right)^{37} \left( \frac{1}{4} \right)^3 \approx 0.0037,$$

$$P_4 = \frac{40 \cdot 39 \cdot 38 \cdot 37}{2 \cdot 3 \cdot 4} \left( \frac{3}{4} \right)^{36} \left( \frac{1}{4} \right)^4 \approx 0.0113,$$

so that

$$Q = 0.016, \quad P = 0.984.$$

The probability P so obtained is somewhat closer to certainty than is usually taken to be sufficient in the theory of gunfire. Most often it is considered permissible to determine the number of shells needed to guarantee the result with probability 0.95.
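The whole computation takes a few lines on a computer. The Python sketch below uses the values of the example (40 shots, probability 1/4 of a hit, five hits required); the function name is an assumption made for illustration.

```python
from math import comb

def prob_at_least(needed, n, p):
    """Probability of at least `needed` hits in n independent shots."""
    q_less = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(needed))
    return 1 - q_less                       # P = 1 - Q

P = prob_at_least(5, 40, 0.25)
print(round(1 - P, 3), round(P, 3))         # Q and P, rounded to three decimals
```

The same function can be called with other numbers of shots to see how the probability of five hits changes, which is the kind of question the empirical method could answer only at the cost of thousands of shells.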

The previous example is somewhat schematized, but it shows in sufficient
detail the practical importance of probability calculations. After establishing
by experiment the dependence of the probable deviations on the range
(for which we did not need to fire a large number of shells), we were then
able to obtain, by simple calculation, the answers to questions of the most
diverse kind. The situation is the same in all other domains where the
collective influence of a large number of random factors leads to a statistical
law. Direct examination of the mass of observations makes clear only
the very simplest statistical laws; it uncovers only a few of the basic
probabilities involved. But then, by means of the laws of the theory of
probability, we use these simplest probabilities to compute the probabilities
of more complicated occurrences and deduce the statistical laws that govern
them.

Sometimes we succeed in completely avoiding massive statistical
material, since the probabilities may be defined by sufficiently convincing
considerations of symmetry. For example, the traditional conclusion that
a die, i.e., a cube made of a homogeneous material, will fall, when thrown
to a sufficient height, with equal probability on each of its faces was
reached long before there was any systematic accumulation of data to
verify it by observation. Systematic experiments of this kind have been
carried out in the last three centuries, chiefly by authors of textbooks in
the theory of probability, at a time when the theory of probability was
already a well-developed science. The results of these experiments were
satisfactory, but the question of extending them to analogous cases
scarcely arouses interest. For example, as far as we know, no one has
carried out sufficiently extensive experiments in tossing homogeneous dice
with twelve sides. But there is no doubt that if we were to make
12,000 such tosses, the twelve-sided die would show each of its faces
approximately a thousand times.
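
A computer can at least imitate such an experiment. The sketch below is a simulation, not an observation: a pseudo-random generator stands in for the die, and the seed and the helper name `toss_counts` are arbitrary choices of ours. It counts how often each of the twelve faces appears in 12,000 simulated tosses.

```python
import random
from collections import Counter

def toss_counts(sides: int, tosses: int, seed: int = 0) -> Counter:
    """Simulate tosses of an ideal die and count how often each face appears."""
    rng = random.Random(seed)  # fixed seed: an arbitrary choice, for reproducibility
    return Counter(rng.randint(1, sides) for _ in range(tosses))

counts = toss_counts(sides=12, tosses=12_000)
for face in range(1, 13):
    print(f"face {face:2d}: {counts[face]}")
```

The counts scatter around 1,000 by a few tens, which is the order of magnitude one expects from the square-root behavior discussed in §3.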

The basic probabilities derived from arguments of symmetry or
homogeneity also play a large role in many serious scientific problems, for
example in all problems of collision or near approach of molecules in
random motion in a gas; another case where the successes have been
equally great is the motion of stars in a galaxy. Of course, in these more
delicate cases we prefer to check our theoretical assumptions by comparison
with observation or experiment.


§3. The Law of Large Numbers and Limit Theorems 

It is completely natural to wish for greater quantitative precision in the
proposition that in a “long” series of tests the frequency of an occurrence
comes “close” to its probability. But here we must form a clear notion
of the delicate nature of the problem. In the most typical cases in the
theory of probability, the situation is such that in an arbitrarily long series
of tests it remains theoretically possible that we may obtain either of the
two extremes for the value of the frequency

$$\frac{\mu}{n} = \frac{n}{n} = 1 \quad \text{and} \quad \frac{\mu}{n} = \frac{0}{n} = 0.$$

Thus, whatever may be the number of tests n, it is impossible to assert
with complete certainty that we will have, say, the inequality


For example, if the event A is the rolling of a six with a die, then in n
trials, the probability that we will turn up a six on all n trials is
$(1/6)^n > 0$; in other words, with probability $(1/6)^n$ we will obtain a
frequency of rolling a six which is equal to one; and with probability
$(1 - 1/6)^n > 0$ a six will not come up at all, i.e., the frequency of rolling
a six will be equal to zero.
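
To see how small, yet never zero, these two extreme probabilities are, one may evaluate them exactly. The following sketch uses exact rational arithmetic; the function names `all_sixes` and `no_sixes` are our own labels for the two extreme events.

```python
from fractions import Fraction

p_six = Fraction(1, 6)  # probability of a six on one roll of an ideal die

def all_sixes(n: int) -> Fraction:
    """Probability that every one of n rolls is a six (frequency equal to 1)."""
    return p_six ** n

def no_sixes(n: int) -> Fraction:
    """Probability that no roll among n is a six (frequency equal to 0)."""
    return (1 - p_six) ** n

for n in (1, 10, 100):
    print(n, float(all_sixes(n)), float(no_sixes(n)))
```

Both quantities decrease geometrically with n but remain strictly positive for every n.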

In all similar problems any nontrivial estimate of the closeness of the
frequency to the probability cannot be made with complete certainty,
but only with some probability less than one. For example, it may be
shown that in independent tests,* with constant probability p of the
occurrence of an event in each test, the inequality

$$\left| \frac{\mu}{n} - p \right| < 0.02 \tag{7}$$

for the frequency $\mu/n$ will be satisfied, for n = 10,000 (and any p), with
probability

$$P > 0.9999. \tag{8}$$

Here we wish first of all to emphasize that in this formulation the
quantitative estimate of the closeness of the frequency $\mu/n$ to the
probability p involves the introduction of a new probability P.

The practical meaning of the estimate (8) is this: If we carry out N sets
of n tests each, and count the M sets in which inequality (7) is satisfied,
then for sufficiently large N we will have approximately

$$\frac{M}{N} \approx P > 0.9999. \tag{9}$$
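
The frequency interpretation just described can be imitated on a computer. The sketch below is a simulation under stated assumptions: it takes p = 1/2, represents each series of n = 10,000 tests by 10,000 pseudo-random bits, and uses N = 2000 series; the seed, the value of N, and the function names are arbitrary choices of ours.

```python
import random

def count_successes(n: int, rng: random.Random) -> int:
    """Number of successes in n independent trials with p = 1/2.

    The n trials are drawn as n random bits; the count of 1-bits is the
    number of successes.
    """
    return bin(rng.getrandbits(n)).count("1")

def fraction_within(N: int, n: int, eps: float, seed: int = 1) -> float:
    """Fraction M/N of the N series in which |mu/n - 1/2| < eps."""
    rng = random.Random(seed)
    M = sum(abs(count_successes(n, rng) / n - 0.5) < eps for _ in range(N))
    return M / N

ratio = fraction_within(N=2000, n=10_000, eps=0.02)
print(f"M/N = {ratio:.4f}")
```

With only 2000 series a violation of inequality (7) will usually not be seen at all, since on average fewer than one series in ten thousand violates it.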

But if we wish to define the relation (9) more precisely, either with
respect to the degree of closeness of M/N to P, or with respect to the
confidence with which we may assert that (9) will be verified, then we must
have recourse to general considerations of the kind introduced previously
in discussing what is meant by the closeness of $\mu/n$ and p. Such
considerations may be repeated as often as we like, but it is clear that this
procedure will never allow us to be free of the necessity, at the last stage,
of referring to probabilities in the primitive imprecise sense of this term.

It would be quite wrong to think that difficulties of this kind are peculiar
in some way to the theory of probability. In a mathematical investigation
of actual events, we always make a model of them. The discrepancies
between the actual course of events and the theoretical model can, in their
turn, be made the subject of mathematical investigation. But for these
discrepancies we must construct a model that we will use without formal
mathematical analysis of the discrepancies which again would arise in it
in actual experiment.

* The proof of the estimate (8) is discussed later in this section.

We note, moreover, that in an actual application of the estimate*

$$P\left\{ \left| \frac{\mu}{n} - p \right| < 0.02 \right\} > 0.9999 \tag{10}$$

to one series of n tests we are already depending on certain considerations
of symmetry: inequality (10) shows that for a very large number N of
series of tests, relation (7) will be satisfied in no less than 99.99% of the
cases; now it is natural to expect with great confidence that inequality
(7) will apply in particular to that one of the sequences of n tests which is
of interest to us, but we may expect this only if we have some reason for
assuming that the position of this sequence among the others is a regular
one, that is, that it has no special features.

The probabilities that we may decide to neglect are different in different
practical situations. We noted earlier that our preliminary calculations for
the expenditure of shells necessary to produce a given result meet the
standard that the problem is to be solved with probability 0.95, i.e., that
the neglected probabilities do not exceed 0.05. This standard is explained
by the fact that if we were to make calculations neglecting a probability
of only 0.01, let us say, we would necessarily require a much greater
expenditure of shells, so that in practice we would conclude that the
task could not be carried out in the time at our disposal, or with the
given supply of shells.

* This is the accepted notation for estimate (8) of the probability of inequality (7).

In scientific investigations also, we are sometimes restricted to statistical
methods calculated on the basis of neglecting probabilities of 0.05,
although this practice should be adopted only in cases where the
accumulation of more extensive data is very difficult. As an example of such
a method let us consider the following problem. We assume that under
specific conditions the customary medicine for treating a certain illness
gives positive results 50% of the time, i.e., with probability 0.5. A new
preparation is proposed, and to test its advantages we plan to use it in
ten cases, chosen without bias from among the patients suffering from
the illness. Here we agree that the advantage of the new preparation will
be considered as proved if it gives a positive result in no less than eight
cases out of the ten. It is easy to calculate that such a procedure involves
the neglect of probabilities of the order of 0.05 of getting a wrong result,
i.e., of indicating an advantage for the new preparation when in fact it is
only equally effective or even worse than the old. For if in each of the ten
experiments, the probability of a positive outcome is equal to p, then the
probability of obtaining in ten experiments 10, 9, or 8 positive outcomes
is equal respectively to

$$P_{10} = p^{10}, \quad P_9 = 10 p^9 (1 - p), \quad P_8 = 45 p^8 (1 - p)^2.$$

For the case p = 1/2 the sum of these is

$$P = P_{10} + P_9 + P_8 = \frac{56}{1024} \approx 0.05.$$

In this way, under the assumption that in fact the new preparation is
exactly as effective as the old, we risk with probability of order 0.05 the
error of finding that the new preparation is better than the old. To reduce
this probability to about 0.01, without increasing the number of
experiments n = 10, we will need to agree that the advantage of the new
preparation is proved if it gives a positive result in no less than nine cases
out of the ten. If this requirement seems too severe to the advocates of the
new preparation, it will be necessary to make the number of experiments
considerably larger than 10. For example, for n = 100, if we agree that
the advantage of the new preparation is proved for $\mu > 65$, then the
probability of error will only be $P \approx 0.0015$.
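
The error probabilities of such decision rules are tail sums of the binomial formula and can be checked exactly. The sketch below assumes, as in the text, that the new preparation is in fact no better than the old (p = 1/2); the function name `tail_probability` is ours. For n = 100 the exact value depends on whether the boundary case of 65 successes is counted, so both variants are printed; each is of the order of one in a thousand, like the rounded figure quoted above.

```python
from fractions import Fraction
from math import comb

def tail_probability(n: int, k_min: int, p: Fraction = Fraction(1, 2)) -> Fraction:
    """P{at least k_min positive outcomes in n independent trials}."""
    return sum(
        comb(n, m) * p**m * (1 - p) ** (n - m)
        for m in range(k_min, n + 1)
    )

risk_eight = tail_probability(10, 8)   # rule "at least 8 of 10"
risk_nine = tail_probability(10, 9)    # stricter rule "at least 9 of 10"
print(risk_eight, float(risk_eight))
print(risk_nine, float(risk_nine))

# For n = 100 the value depends on whether the boundary 65 itself counts.
print(float(tail_probability(100, 65)), float(tail_probability(100, 66)))
```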

For serious scientific investigations a standard of 0.05 is clearly
insufficient; but even in such academic and circumstantial matters as the
treatment of astronomical observations, it is customary to neglect
probabilities of error of 0.001 or 0.003. On the other hand, some of the
scientific results based on the laws of probability are considerably more
reliable even than that; i.e., they involve the neglect of smaller
probabilities. We will return to this question later.

In the previous examples, we have made use of particular cases of the
binomial formula (6)

$$P_m = C_n^m\, p^m (1 - p)^{n-m}$$

for the probability of getting exactly m positive results in n independent
trials, in each one of which a positive outcome has probability p. Let us
consider, by means of this formula, the question raised at the beginning
of this section concerning the probability

$$P = P\left\{ \left| \frac{\mu}{n} - p \right| < \epsilon \right\}, \tag{11}$$

where $\mu$ is the actual number of positive results.* Obviously, this
probability may be written as the sum of those $P_m$ for which m satisfies the
inequality

$$\left| \frac{m}{n} - p \right| < \epsilon, \tag{12}$$

i.e., in the form

$$P = \sum_{m=m_1}^{m_2} P_m, \tag{13}$$

where $m_1$ is the smallest of the values of m satisfying inequality (12), and
$m_2$ is the largest.

* Here $\mu$ takes the values m = 0, 1, ..., n, with probability $P_m$; i.e.,
$P\{\mu = m\} = P_m$.

Formula (13) for fairly large n is hardly convenient for immediate
calculation, a fact which explains the great importance of the asymptotic
formula discovered by de Moivre for p = 1/2 and by Laplace for general p.
This formula allows us to find $P_m$ very simply and to study its behavior
for large n. The formula in question is

$$P_m \sim \frac{1}{\sqrt{2\pi n p(1 - p)}}\, e^{-\frac{(m - np)^2}{2np(1 - p)}}. \tag{14}$$

If p is not too close to zero or one, it is sufficiently exact even for n of the
order of 100. If we set

$$t = \frac{m - np}{\sqrt{np(1 - p)}}, \tag{15}$$

then formula (14) becomes

$$P_m \sim \frac{1}{\sqrt{2\pi n p(1 - p)}}\, e^{-t^2/2}. \tag{16}$$
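
The quality of the approximation (14) for n of the order of 100 can be judged by direct comparison with the exact binomial terms. The following sketch does this for n = 100 and p = 1/2; these values, and the function names `exact_pm` and `laplace_pm`, are illustrative choices of ours rather than anything prescribed by the text.

```python
from math import comb, exp, pi, sqrt

def exact_pm(m: int, n: int, p: float) -> float:
    """Exact binomial probability P_m of m positive results in n trials."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)

def laplace_pm(m: int, n: int, p: float) -> float:
    """De Moivre-Laplace approximation (14)/(16) to P_m."""
    variance = n * p * (1 - p)
    t = (m - n * p) / sqrt(variance)          # substitution (15)
    return exp(-t * t / 2) / sqrt(2 * pi * variance)

n, p = 100, 0.5  # illustrative values, chosen here rather than taken from the text
for m in (40, 50, 60):
    exact, approx = exact_pm(m, n, p), laplace_pm(m, n, p)
    print(f"m={m}: exact={exact:.6f}  approx={approx:.6f}  "
          f"rel.err={(approx - exact) / exact:+.2%}")
```

Repeating the comparison with p close to 0 or 1, or with small n, shows the loss of accuracy mentioned in the text.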


From (13) and (16) one may derive an approximate representation of the
probability (11)

$$P \sim \frac{1}{\sqrt{2\pi}} \int_{-T}^{T} e^{-t^2/2}\, dt = F(T), \tag{17}$$

where

$$T = \epsilon \sqrt{\frac{n}{p(1 - p)}}. \tag{18}$$


The difference between the left and right sides of (17) for fixed p, different
from zero or one, approaches zero uniformly with respect to $\epsilon$, as
$n \to \infty$. For the function F(T) detailed tables have been constructed.
Here is a small excerpt from them:

T |    1    |    2    |    3    |    4
F | 0.68269 | 0.95450 | 0.99730 | 0.99993

For $T \to \infty$ the values of the function F(T) converge to one.
From formula (17) we derive an estimate of the probability

$$P\left\{ \left| \frac{\mu}{n} - p \right| < 0.02 \right\}$$

for n = 10,000. Since

$$T = \frac{2}{\sqrt{p(1 - p)}},$$

we have

$$P \approx F\left( \frac{2}{\sqrt{p(1 - p)}} \right).$$

Since the function F(T) is monotonically increasing with increasing T, it
follows that for an estimate of P from below which is independent of
p, we must take the smallest possible (for the various p) value of T. Such
a smallest value occurs for p = 1/2 and is equal to 4. Thus, approximately

$$P \geq F(4) = 0.99993. \tag{19}$$

In the estimate (19) no account is taken of the error arising from the
approximate character of formula (17). By estimating the error involved
here, we may show that in any case P > 0.9999.
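
Both the tabulated values of F(T) and the final estimate can be checked numerically. The sketch below relies on the identity F(T) = erf(T/√2), a standard fact that is not stated in the text, and, for the least favorable case p = 1/2, sums the binomial terms exactly with integer arithmetic. The function names are ours, and the exact sum is offered only as a numerical confirmation, not as the error estimate the text refers to.

```python
from fractions import Fraction
from math import comb, erf, sqrt

def F(T: float) -> float:
    """F(T) = (1/sqrt(2*pi)) * integral of exp(-t^2/2) over [-T, T]."""
    return erf(T / sqrt(2))

def T_value(eps: float, n: int, p: float) -> float:
    """T from formula (18)."""
    return eps * sqrt(n / (p * (1 - p)))

def exact_P(n: int, eps: Fraction) -> float:
    """Exact P{|mu/n - 1/2| < eps} for p = 1/2, by summing binomial terms."""
    half = Fraction(1, 2)
    favorable = sum(
        comb(n, m) for m in range(n + 1) if abs(Fraction(m, n) - half) < eps
    )
    return float(Fraction(favorable, 2**n))

for T in (1, 2, 3, 4):
    print(f"F({T}) = {F(T):.5f}")

n, eps = 10_000, Fraction(2, 100)
print("T at p = 1/2:", T_value(float(eps), n, 0.5))
print("normal approximation:", F(T_value(float(eps), n, 0.5)))
print("exact binomial sum:  ", exact_P(n, eps))
```

The printed values of F agree with the table to within the rounding of the last digit.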

In connection with this example of the application of formula (17),
one should note that the estimates of the remainder term in formula (17)
given in theoretical works on the theory of probability were for a long time
unsatisfactory. Thus the applications of (17) and similar formulas to
calculations based on small values of n, or with probabilities p very close
to 0 or 1 (such probabilities are frequently of particular importance) were
often based on experimental verification only of results of this kind for a
restricted number of examples, and not on any valid estimates of the
possible error. Also, it was shown by more detailed investigation that in
many important practical cases the asymptotic formulas introduced
previously require not only an estimate of the remainder term but also
certain further refinements (without which the remainder term would be
too large). In both directions the most complete results are due to S. N.
Bernšteĭn.

Relations (11), (17), and (18) may be rewritten in the form

$$P\left\{ \left| \frac{\mu}{n} - p \right| < t \sqrt{\frac{p(1 - p)}{n}} \right\} \sim \frac{1}{\sqrt{2\pi}} \int_{-t}^{t} e^{-s^2/2}\, ds. \tag{20}$$

For sufficiently large t the right side of formula (20), which does not
contain n, is arbitrarily close to one, i.e., to the value of the probability
which gives complete certainty. We see, in this way, that, as a rule, the
deviation of the frequency $\mu/n$ from the probability p is of order
$1/\sqrt{n}$. Such a proportionality between the exactness of a law of
probability and the square root of the number of observations is typical for
many other questions. Sometimes it is even said in popular simplifications
that “the law of the square root of n” is the basic law of the theory of
probability. Complete precision concerning this idea was attained through the
introduction and systematic use by the great Russian mathematician P. L.
Čebyšev of the concepts of “mathematical expectation” and “variance”
for sums and arithmetic means of “random variables.”

A random variable is the name given to a quantity which under given
conditions S may take various values with specific probabilities. For us it
is sufficient to consider random variables that may take on only a finite
number of different values. To give the probability distribution, as it is
called, of such a random variable $\xi$ it is sufficient to state its possible
values $x_1, x_2, \ldots, x_s$ and the probabilities

$$P_r = P\{\xi = x_r\}.$$

The sum of these probabilities for all possible values of the variable $\xi$ is
always equal to one:

$$\sum_{r=1}^{s} P_r = 1.$$

The number investigated above of positive outcomes in n experiments
may serve as an example of a random variable.

The mathematical expectation of the variable $\xi$ is the expression

$$M(\xi) = \sum_{r=1}^{s} P_r x_r,$$

and the variance of $\xi$ is the mathematical expectation of the square of the
deviation $\xi - M(\xi)$, i.e., the expression

$$D(\xi) = \sum_{r=1}^{s} P_r [x_r - M(\xi)]^2.$$

The square root of the variance

$$\sigma_\xi = \sqrt{D(\xi)}$$

is called the standard deviation (of the variable from its mathematical
expectation $M(\xi)$).

At the basis of the simplest applications of variance and standard
deviation lies the famous inequality of Čebyšev

$$P\{ |\xi - M(\xi)| \leq t \sigma_\xi \} \geq 1 - \frac{1}{t^2}. \tag{21}$$

It shows that deviations of $\xi$ from $M(\xi)$ significantly greater than
$\sigma_\xi$ are rare.
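
For a random variable with finitely many values, the mathematical expectation, the variance, and the two sides of Čebyšev's inequality can all be computed directly from the distribution. The sketch below takes an ideal die as its example; this choice, and the function names, are ours and serve only as an illustration.

```python
from fractions import Fraction
from math import sqrt

# A finite distribution as {value x_r: probability P_r}; an ideal die is
# used here purely as an illustration.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(dist):
    """M(xi) = sum of P_r * x_r."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    """D(xi) = sum of P_r * (x_r - M(xi))^2."""
    m = expectation(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def prob_within(dist, t):
    """Exact P{|xi - M(xi)| <= t * sigma_xi}."""
    m, sigma = expectation(dist), sqrt(variance(dist))
    return sum(p for x, p in dist.items() if abs(x - m) <= t * sigma)

assert sum(die.values()) == 1  # probabilities add up to one
print("M =", expectation(die), " D =", variance(die))
for t in (1.0, 1.5, 2.0):
    print(t, float(prob_within(die, t)), ">=", 1 - 1 / t**2)
```

For the die the exact probabilities lie well above the Čebyšev bound, which is typical: the inequality is valid for every distribution and is therefore rarely sharp for a particular one.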


As for the sum of random variables

$$\xi = \xi^{(1)} + \xi^{(2)} + \cdots + \xi^{(n)},$$

their mathematical expectations always satisfy the equation

$$M(\xi) = M(\xi^{(1)}) + M(\xi^{(2)}) + \cdots + M(\xi^{(n)}). \tag{22}$$

But the analogous equation for the variance

$$D(\xi) = D(\xi^{(1)}) + D(\xi^{(2)}) + \cdots + D(\xi^{(n)}) \tag{23}$$

is true only under certain restrictions. For the validity of equation (23) it is
sufficient, for example, that the variables $\xi^{(i)}$ and $\xi^{(j)}$ with
different indices not be “correlated” with one another, i.e., that for
$i \neq j$ the equation*

$$M\{ [\xi^{(i)} - M(\xi^{(i)})] [\xi^{(j)} - M(\xi^{(j)})] \} = 0 \tag{24}$$

be satisfied.

In particular, equation (24) holds if the variables $\xi^{(i)}$ and $\xi^{(j)}$ are
independent of each other.† Consequently, for mutually independent terms
equation (23) always holds. For the arithmetic mean

$$\zeta = \frac{1}{n} (\xi^{(1)} + \xi^{(2)} + \cdots + \xi^{(n)})$$

it follows from (23) that

$$D(\zeta) = \frac{1}{n^2} [D(\xi^{(1)}) + D(\xi^{(2)}) + \cdots + D(\xi^{(n)})]. \tag{25}$$
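
That equation (23) depends on the absence of correlation is easily seen on a small example. The sketch below compares two independent dice with the extreme case in which the second variable simply repeats the first; the helper names (`law_of`, `independent`, `dependent`) are our own, and exact fractions are used throughout.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_face = Fraction(1, 6)

def mean(dist):
    """Expectation of a distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

def var(dist):
    """Variance of a distribution given as {value: probability}."""
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def law_of(function, joint):
    """Distribution of function(a, b) under a joint law {(a, b): probability}."""
    dist = {}
    for (a, b), p in joint.items():
        value = function(a, b)
        dist[value] = dist.get(value, 0) + p
    return dist

# Two independent dice: every pair (a, b) has probability 1/36.
independent = {(a, b): p_face * p_face for a, b in product(faces, faces)}
# Fully dependent pair: the second variable always repeats the first.
dependent = {(a, a): p_face for a in faces}

one_die = {a: p_face for a in faces}
add = lambda a, b: a + b
print(var(law_of(add, independent)), "vs", 2 * var(one_die))  # (23) holds
print(var(law_of(add, dependent)), "vs", 2 * var(one_die))    # (23) fails
```

In the dependent case the variance of the sum is twice as large as (23) would give, while the expectations add in both cases, in agreement with (22).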


* The correlation coefficient between the variables $\xi^{(i)}$ and $\xi^{(j)}$ is
the expression

$$R = \frac{M\{ [\xi^{(i)} - M(\xi^{(i)})] [\xi^{(j)} - M(\xi^{(j)})] \}}{\sigma_{\xi^{(i)}}\, \sigma_{\xi^{(j)}}}.$$

If $\sigma_{\xi^{(i)}} > 0$ and $\sigma_{\xi^{(j)}} > 0$, then condition (24) is equivalent to
saying that R = 0.

The correlation coefficient R characterizes the degree of dependence between random
variables. $|R| \leq 1$ always, and $R = \pm 1$ only for a linear relationship

$$\eta = a\xi + b \quad (a \neq 0).$$

For independent variables R = 0.

† The independence of two random variables $\xi$ and $\eta$, which may assume,
respectively, the values $x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_n$, is defined
to mean that for any i and j the events $A_i = \{\xi = x_i\}$ and
$B_j = \{\eta = y_j\}$ are independent in the sense of the definition given in §2.

We now assume that for each of these terms the variance does not
exceed a certain constant

$$D(\xi^{(i)}) \leq C^2.$$

Then from (25)

$$D(\zeta) \leq \frac{C^2}{n},$$

and from Čebyšev's inequality for any t

$$P\left\{ |\zeta - M(\zeta)| \leq \frac{tC}{\sqrt{n}} \right\} \geq 1 - \frac{1}{t^2}. \tag{26}$$

Inequality (26) expresses what is called the law of large numbers, in the
form established by Čebyšev: If the variables $\xi^{(i)}$ are mutually independent
and have bounded variance, then for increasing n the arithmetic mean
$\zeta$ will deviate more and more rarely from the mathematical expectation
$M(\zeta)$.
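
A bound of the Čebyšev type, in which the mean deviates by at most t times a constant over the square root of n with probability at least 1 - 1/t², can be compared with exact values in the simplest case of terms equal to 0 or 1. In the sketch below the variance p(1 - p) of each term never exceeds 1/4, so the squared constant 1/4 is used; the success probability 3/10 and the function names are illustrative assumptions of ours.

```python
from fractions import Fraction
from math import comb

C_SQUARED = Fraction(1, 4)   # p(1 - p) <= 1/4, so 1/4 bounds every variance

def exact_prob(n: int, p: Fraction, t: Fraction) -> Fraction:
    """Exact P{|zeta - p| <= t*C/sqrt(n)} for the mean zeta of n 0/1 terms.

    The comparison is done on squares, which keeps the arithmetic rational.
    """
    radius_squared = t**2 * C_SQUARED / n
    return sum(
        comb(n, m) * p**m * (1 - p) ** (n - m)
        for m in range(n + 1)
        if (Fraction(m, n) - p) ** 2 <= radius_squared
    )

def chebyshev_bound(t: Fraction) -> Fraction:
    """Lower bound 1 - 1/t^2 from Chebyshev's inequality."""
    return 1 - 1 / t**2

p = Fraction(3, 10)          # illustrative success probability, not from the text
for n in (10, 100, 1000):
    for t in (Fraction(2), Fraction(3)):
        print(n, t, float(exact_prob(n, p, t)), ">=", float(chebyshev_bound(t)))
```

The exact probabilities are considerably larger than the bound, which is the price paid for an inequality that holds under such general conditions.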

More precisely, the sequence of variables

$$\xi^{(1)}, \xi^{(2)}, \ldots, \xi^{(n)}, \ldots$$

is said to obey the law of large numbers if for the corresponding arithmetic
means $\zeta$ and for any constant $\epsilon > 0$

$$P\{ |\zeta - M(\zeta)| \leq \epsilon \} \to 1 \tag{27}$$

for $n \to \infty$.

In order to pass from inequality (26) to the limiting relation (27) it is
sufficient to put

$$t = \frac{\epsilon \sqrt{n}}{C}.$$



A large number of investigations of A. A. Markov, S. N. Bernšteĭn,
A. Ja. Hinčin, and others were devoted to the question of widening as
far as possible the conditions under which the limit relation (27) is valid,
i.e., the conditions for the validity of the law of large numbers. These
investigations are of basic theoretical significance, but still more important
is an exact study of the probability distribution for the variable
$\zeta - M(\zeta)$.
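
The law of large numbers itself, the statement that the arithmetic mean falls within any fixed distance of its expectation with probability tending to one, can be watched numerically in the simplest case. The sketch below computes this probability exactly for the mean of n independent terms equal to 0 or 1; the values p = 1/2 and a distance of 1/10, like the function name, are illustrative choices of ours.

```python
from fractions import Fraction
from math import comb

def prob_mean_within(n: int, p: Fraction, eps: Fraction) -> float:
    """Exact P{|zeta - p| <= eps}, zeta being the mean of n independent 0/1 terms."""
    total = sum(
        comb(n, m) * p**m * (1 - p) ** (n - m)
        for m in range(n + 1)
        if abs(Fraction(m, n) - p) <= eps
    )
    return float(total)

p, eps = Fraction(1, 2), Fraction(1, 10)   # illustrative choices, not from the text
for n in (10, 100, 1000):
    print(n, prob_mean_within(n, p, eps))
```

The probabilities increase toward one as n grows, but how fast they do so is exactly the question that requires the finer study of the distribution of the deviation.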

One of the greatest services rendered by the classical Russian school of
mathematicians to the theory of probability is the establishment of the
fact that under very wide conditions the equation

$$P\{ t_1 \sigma_\zeta < \zeta - M(\zeta) < t_2 \sigma_\zeta \} \sim \frac{1}{\sqrt{2\pi}} \int_{t_1}^{t_2} e^{-t^2/2}\, dt \tag{28}$$

is asymptotically valid (i.e., with greater and greater exactness as n
increases beyond all bounds).

Čebyšev gave an almost complete proof of this formula for the case of
independent and bounded terms. Markov closed a gap in Čebyšev's
argument and widened the conditions of applicability of formula (28).
Still more general conditions were given by Ljapunov. The applicability
of formula (28) to the sum of mutually dependent terms was studied with
particular completeness by S. N. Bernšteĭn.

Formula (28) embraces such a large number of particular cases that
it has long been called the central limit theorem in the theory of probability.
Even though it has been shown lately to be included in a series of more
general laws, its value can scarcely be overrated even at the present time.
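
That the central limit theorem is not confined to trials with two outcomes can be illustrated on the sum of the points shown by several dice. The sketch below builds the exact distribution of the sum of n = 50 dice by repeated convolution and compares the probability of an interval, measured in standard deviations, with the corresponding normal integral. The number of dice, the limits -1 and 2, and the function names are arbitrary choices of ours; the normal integral is evaluated through the error function.

```python
from math import erf, sqrt

def normal_interval(t1: float, t2: float) -> float:
    """(1/sqrt(2*pi)) * integral of exp(-t^2/2) over [t1, t2]."""
    return 0.5 * (erf(t2 / sqrt(2)) - erf(t1 / sqrt(2)))

def sum_distribution(single: dict, n: int) -> dict:
    """Exact distribution of the sum of n independent copies, by convolution."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for total, p_total in dist.items():
            for value, p_value in single.items():
                new[total + value] = new.get(total + value, 0.0) + p_total * p_value
        dist = new
    return dist

die = {face: 1 / 6 for face in range(1, 7)}    # illustrative choice of terms
n = 50
mean_sum = n * 3.5
sigma_sum = sqrt(n * 35 / 12)                   # variance of one die is 35/12

# The event for the mean zeta is the same as this event for the sum S = n * zeta.
dist = sum_distribution(die, n)
t1, t2 = -1.0, 2.0
exact = sum(p for s, p in dist.items()
            if t1 * sigma_sum < s - mean_sum < t2 * sigma_sum)
print(f"exact = {exact:.4f}, normal = {normal_interval(t1, t2):.4f}")
```

The two numbers agree to about two decimal places; the remaining difference comes mainly from the fact that the sum takes only integer values.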

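If the terms are independent and their variances are all the same, and
are equal to

$$D(\xi^{(i)}) = \sigma^2,$$

then it is convenient, using relation (25), to put formula (28) into the form

$$P\left\{ \frac{t_1 \sigma}{\sqrt{n}} < \zeta - M(\zeta) < \frac{t_2 \sigma}{\sqrt{n}} \right\} \sim \frac{1}{\sqrt{2\pi}} \int_{t_1}^{t_2} e^{-t^2/2}\, dt. \tag{29}$$

Let us show that relation (29) contains the solution of the problem,
considered earlier, of evaluating the deviation of the frequency $\mu/n$ from
the probability p. For this we introduce the random variables $\xi^{(i)}$, defined
as follows:

$$\xi^{(i)} = \begin{cases} 0, & \text{if the } i\text{th test has a negative outcome,} \\ 1, & \text{if the } i\text{th test has a positive outcome.} \end{cases}$$

It is easy to verify that then

$$\mu = \xi^{(1)} + \xi^{(2)} + \cdots + \xi^{(n)}, \qquad \frac{\mu}{n} = \zeta,$$

$$M(\xi^{(i)}) = p, \quad D(\xi^{(i)}) = p(1 - p), \quad M(\zeta) = p,$$

and formula (29) gives

$$P\left\{ t_1 \sqrt{\frac{p(1 - p)}{n}} < \frac{\mu}{n} - p < t_2 \sqrt{\frac{p(1 - p)}{n}} \right\} \sim \frac{1}{\sqrt{2\pi}} \int_{t_1}^{t_2} e^{-t^2/2}\, dt,$$

which for $t_1 = -t$, $t_2 = t$ leads again to formula (20).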

§4. Further Remarks on the Basic Concepts of the Theory of Probability 

In speaking of random events, which have the property that their
frequencies tend to become stable, i.e., in a long sequence of experiments
repeated under fixed conditions, their frequencies are grouped around
some standard level, called their probability P(A/S), we were guilty, in §1,
of a certain vagueness in our formulations, in two respects. In the first
place, we did not indicate how long the sequence of experiments $n_r$ must
be in order to exhibit beyond all doubt the existence of the supposed
stability; in other words, we did not say what deviations of the frequencies
$\mu_r/n_r$ from one another or from their standard level p were allowable for
sequences of trials $n_1, n_2, \ldots, n_s$ of given length. This inexactness in the
first stage of formulating the concepts of a new science is unavoidable.
It is no greater than the well-known vagueness surrounding the simplest
geometric concepts of point and straight line and their physical meaning.
This aspect of the matter was made clear in §3.

More fundamental, however, is the second lack of clearness concealed in
our formulations; it concerns the manner of forming the sequences of
trials in which we are to examine the stability of the frequency of
occurrence of the event A.

As stated earlier, we are led to statistical and probabilistic methods of
investigation in those cases in which an exact specific prediction of the
course of events is impossible. But if we wish to create in some artificial
way a sequence of events that will be, as far as possible, purely random,
then we must take special care that there shall be no methods available
for determining in advance those cases in which A is likely to occur with
more than normal frequency.

Such precautions are taken, for example, in the organization of
government lotteries. If in a given lottery there are to be M winning tickets
in a drawing of N tickets, then the probability of winning for an individual
ticket is equal to p = M/N. This means that in whatever manner we
select, in advance of the drawing, a sufficiently large set of n tickets, we
can be practically certain that the ratio $\mu/n$ of the number $\mu$ of winning
tickets in the chosen set to the whole number n of tickets in this set will be
close to p. For example, people who prefer tickets labeled with an even
number will not have any systematic advantage over those who prefer
tickets labeled with odd numbers, and in exactly the same way there
will be no advantage in proceeding on the principle, say, that it is always
better to buy tickets with numbers having exactly three prime factors, or
tickets whose numbers are close to those that were winners in the
preceding lottery, etc.
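
The absence of any advantage for a selection rule fixed in advance can be imitated by a small simulation. In the sketch below the drawing is replaced by a pseudo-random sample, which is only a stand-in for a real lottery; the values N = 100,000 and M = 10,000, the seed, and the function name are arbitrary choices of ours.

```python
import random

def lottery_frequencies(N: int, M: int, seed: int = 2):
    """Draw M winning tickets among N and report mu/n for two selection rules."""
    rng = random.Random(seed)               # seed fixed only for reproducibility
    winners = set(rng.sample(range(1, N + 1), M))

    def frequency(chosen):
        chosen = list(chosen)
        mu = sum(ticket in winners for ticket in chosen)
        return mu / len(chosen)

    even = frequency(range(2, N + 1, 2))    # buyers who prefer even numbers
    odd = frequency(range(1, N + 1, 2))     # buyers who prefer odd numbers
    return even, odd

even, odd = lottery_frequencies(N=100_000, M=10_000)
print(f"p = 0.1, even tickets: {even:.4f}, odd tickets: {odd:.4f}")
```

Any other rule that does not look at the outcome of the drawing may be substituted for the even and odd selections with the same result.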

Similarly, when we are firing a well-constructed gun of a given type,
with a well-trained crew and with shells that have been subjected to a
standard quality control, the deviation from the mean position of the
points of impact of the shells will be less than the previously determined
probable deviation B in approximately half the cases. This fraction remains
the same in a series of successive trials, and also in case we count separately
the number of deviations that are less than B for even-numbered shots
(in the order of firing) or for odd-numbered. But it is completely possible
that if we were to make a selection of particularly homogeneous shells
(with respect to weight, etc.), the scattering would be considerably
decreased, i.e., we would have a sequence of firings for which the fraction
of the deviations which are greater than the standard B would be
considerably less than a half.

Thus, to say that an event A is “random” or “stochastic” and to assign
it a definite probability

$$p = P(A/S)$$

is possible only when we have already determined the class of allowable
ways of setting up the series of experiments. The nature of this class will
be assumed to be included in the conditions S.

For given conditions S the properties of the event A of being random
and of having the probability p = P(A/S) express the objective character
of the connection between the conditions S and the event A. In other words,
there exists no event which is absolutely random; an event is random or is
predetermined depending on the connection in which it is considered, but
under specific conditions an event may be random in a completely
nonsubjective sense, i.e., independently of the state of knowledge of any
observer. If we imagine an observer who can master all the detailed
distinctive properties and particular circumstances of the flight of shells,
and can thus predict for each one of them the deviation from the mean
trajectory, his presence would still not prevent the shells from scattering
in accordance with the laws of the theory of probability, provided, of
course, that the shooting was done in the usual manner, and not according
to instructions from our imaginary observer.

In this connection we note that the formation of a series of the kind
discussed earlier, in which there is a tendency for the frequencies to become
constant in the sense of being grouped around a normal value, namely
the probability, proceeds in the actual world in a manner completely
independent of our intervention. For example, it is precisely by virtue of
the random character of the motion of the molecules in a gas that the
number of molecules which, even in a very small interval of time, strike
an arbitrarily preassigned small section of the wall of the container (or of
the surface of bodies situated in the gas) proves to be proportional with
very great exactness to the area of this small piece of the wall and to the
length of the interval of time. Deviations from this proportionality in
cases where the number of hits is not large also follow the laws of the
theory of probability and produce phenomena of the type of Brownian
motion, of which more will be said later.

We turn now to the objective meaning of the concept of independence.
We recall that the conditional probability of an event A under the condition
B is defined by the formula

$$P(A/B) = \frac{P(AB)}{P(B)}. \tag{30}$$

We also recall that events A and B are called independent if, as in (4),

$$P(AB) = P(A)P(B).$$

From the independence of the events A and B and the fact that P(B) > 0
it follows that

$$P(A/B) = P(A).$$
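
These definitions are easily tried out on a finite set of equally probable outcomes. The sketch below uses the 36 outcomes of two ideal dice; the particular events A, B, and C, and the function names `prob` and `conditional`, are our own illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two ideal dice, all equally probable.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event) -> Fraction:
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def conditional(event_a, event_b) -> Fraction:
    """P(A/B) = P(AB) / P(B); requires P(B) > 0."""
    return prob(lambda w: event_a(w) and event_b(w)) / prob(event_b)

A = lambda w: w[0] == 6            # first die shows a six
B = lambda w: w[1] % 2 == 1        # second die shows an odd number
C = lambda w: w[0] + w[1] >= 10    # the sum is at least ten

print(prob(A), conditional(A, B))  # equal: A and B are independent
print(prob(A), conditional(A, C))  # different: A and C are dependent
```

Here the independence of A and B is obtained by computation from the assumed equal probabilities of the 36 outcomes, while A and C fail the same test.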

All the theorems of the mathematical theory of probability that deal
with independent events apply to any events satisfying the condition (4),
or to its generalization to the case of the mutual independence of several
events. These theorems will be of little interest, however, if this definition
bears no relation to the properties of objective events which are
independent in the causal sense.

It is known, for example, that the probability of giving birth to a boy is,
with sufficient stability, P(A) = 22/43. If B denotes the condition that the
birth occur on a day of the conjunction of Jupiter with Mars, then under
the assumption that the position of the planets does not influence the fate
of individuals, the conditional probability P(A/B) has the same value:
P(A/B) = 22/43; i.e., the actual calculation of the frequency of births
of boys under such special astrological conditions would give just the same
frequency 22/43. Although such a calculation has probably never been
carried out on a sufficiently large scale, still there is no reason to doubt
what the result would be.

We give this example, from a somewhat outmoded subject, in order to 
show that the development of human knowledge consists not only in 
establishing valid relations among phenomena, but also in refuting 
imagined relations, i.e., in establishing in relevant cases the thesis of the 
independence of any two sets of events. This unmasking of the meaningless 
attempts of the astrologers to connect two sets of events that are not in 
fact connected is one of the classic examples. 

Naturally, in dealing with the concept of independence, we must not
proceed in too absolute a fashion. For example, from the law of universal
gravitation, it is an undoubted fact that the motions of the moons of Jupiter
have a certain effect, say, on the flight of an artillery shell. But it is also
obvious that in practice this influence may be ignored. From the
philosophical point of view, we may perhaps, in a given concrete situation,
speak more properly not of the independence but of the insignificance of
the dependence of certain events. However that may be, the independence
of events in the cited concrete and relative sense of this term in no way
contradicts the principle of the universal interconnection of all phenomena;
it serves only as a necessary supplement to this principle.

The computation of probabilities from formulas derived by assuming
the independence of certain events is still of practical interest in cases
where the events were originally independent but became interdependent
as a result of the events themselves. For example, one may compute
probabilities for the collision of particles of cosmic radiation with particles
of the medium penetrated by the radiation, on the assumption that the
motion of the particles of the medium, up to the time of the appearance
near them of a rapidly moving particle of cosmic radiation, proceeds
independently of the motion of the cosmic particle. One may compute
the probability that a hostile bullet will strike the blade of a rotating
propeller, on the assumption that the position of the blade with respect
to the axis of rotation does not depend on the trajectory of the bullet, a
supposition that will of course be wrong with respect to the bullets of the
aviator himself, since they are fired between the blades of the rotating
propeller. The number of such examples may be extended without limit.

It may even be said that wherever probabilistic laws turn up in any 
clear-cut way we are dealing with the influence of a large number of 
factors that, if not entirely independent of one another, are interconnected 
only in some weak sense. 

This does not at all mean that we should uncritically introduce
assumptions of independence. On the contrary, it leads us, in the first place,
to be particularly careful in the choice of criteria for testing hypotheses of
independence, and second, to be very careful in investigating the borderline
cases where dependence between the facts must be assumed but is of such
a kind as to introduce complications into the relevant laws of probability.
We noted earlier that the classical Russian school of the theory of
probability has carried out far-reaching investigations in this direction.

To bring to an end our discussion of the concept of independence, we
note that, just as with the definition of independence of two events given
in formula (4), the formal definition of the independence of several random
variables is considerably broader than the concept of independence in
the practical world, i.e., the absence of causal connection.

Let us assume, for example, that the point $\xi$ falls in the interval [0, 1] in
such a manner that for

$$0 \le a < b \le 1$$

the probability that it belongs to the segment [a, b] is equal to the length
of this segment, $b - a$. It is easy to prove that in the expansion

$$\xi = \frac{a_1}{10} + \frac{a_2}{100} + \frac{a_3}{1000} + \cdots$$

of the abscissa of the point $\xi$ in a decimal fraction, the digits $a_k$ will be
mutually independent, although they are interconnected by the way they
are produced.* (From this fact follow many theoretical results, some of
which are of practical interest.)
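
The independence of the decimal digits can be checked exactly for the first
two digits. The following is only a minimal sketch in Python, with function
names invented for the illustration. It relies solely on the assumption stated
above, that the probability of an interval equals its length, and computes the
lengths with exact fractions.

```python
from fractions import Fraction


def prob_first_two_digits(i, j):
    """P(a_1 = i and a_2 = j): length of [(10i + j)/100, (10i + j + 1)/100)."""
    lo = Fraction(10 * i + j, 100)
    hi = Fraction(10 * i + j + 1, 100)
    return hi - lo


def prob_first_digit(i):
    """P(a_1 = i): length of the interval [i/10, (i + 1)/10)."""
    return Fraction(i + 1, 10) - Fraction(i, 10)


def prob_second_digit(j):
    """P(a_2 = j): total length of the ten intervals whose second digit is j."""
    return sum(prob_first_two_digits(i, j) for i in range(10))


def first_two_digits_independent():
    """Check P(a_1 = i, a_2 = j) = P(a_1 = i) P(a_2 = j) for every digit pair."""
    return all(
        prob_first_two_digits(i, j) == prob_first_digit(i) * prob_second_digit(j)
        for i in range(10)
        for j in range(10)
    )
```

Each joint probability is 1/100 and each single-digit probability is 1/10, so
the product rule holds for all hundred pairs, although both digits are read
off from the same point.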

Such flexibility in the formal definition of independence should not be
considered as a blemish. On the contrary it merely extends the domain of
applicability of theorems established for one or another assumption of
independence. These theorems are equally applicable in cases where the
independence is postulated on the basis of practical considerations and in
cases where the independence is proved by computation proceeding from
previous assumptions concerning the probability distributions of the
events and the random variables under study.

In general, investigation of the formal structure of the mathematical
apparatus of the theory of probability has led to interesting results. It turns
out that this apparatus occupies a very definite and clear-cut place in the
classification, which nowadays is gradually becoming clear in outline, of
the basic objects of study in contemporary mathematics.

We have already spoken of the concepts of intersection $AB$ and union
$A \cup B$ of the events $A$ and $B$. We recall that events are called mutually
exclusive if their intersection is empty, i.e., if $AB = N$, where $N$ is the
symbol for an impossible event.

The basic axiom of the elementary theory of probability consists of the
requirement (cf. §2) that under the condition $AB = N$ we have the equation

$$P(A \cup B) = P(A) + P(B).$$

The basic concepts of the theory of probability, namely random events
and their probabilities, are completely analogous in their properties to
plane figures and their areas. It is sufficient to understand by $AB$ the
intersection (common part) of two figures, by $A \cup B$ their union, by $N$
the conventional “empty” figure, and by $P(A)$ the area of the figure $A$,
whereupon the analogy is complete.


* This is also valid, for any $n$, for the digits $a_k$ in the expansion of the number $\xi$
in the fraction

$$\xi = \frac{a_1}{n} + \frac{a_2}{n^2} + \frac{a_3}{n^3} + \cdots$$

The same remarks apply to the volumes of three-dimensional figures.

The most general theory of entities of such a type, which contains as
special cases the theory of volume and area, is now usually called measure
theory, discussed in Chapter XV in connection with the theory of functions
of a real variable.

It remains only to note that in the theory of probability, in comparison
with the general theory of measure or in particular with the theory of area
and volume, there is a certain special feature: A probability is never greater
than one. This maximal probability holds for a necessary event $U$:

$$P(U) = 1.$$

The analogy is by no means superficial. It turns out that the whole
mathematical theory of probability from the formal point of view may be
constructed as a theory of measure, making the special assumption that
the measure of “the entire space” $U$ is equal to one.*

Such an approach to the matter has produced complete clarity in the
formal construction of the mathematical theory of probability and has also
led to concrete progress not only in this theory itself but in other theories
closely related to it in their formal structure. In the theory of probability
success has been achieved by refined methods developed in the metric
theory of functions of a real variable and at the same time probabilistic
methods have proved to be applicable to questions in neighboring domains
of mathematics not “by analogy,” but by a formal and strict transfer of
them to the new domain. Wherever we can show that the axioms of the
theory of probability are satisfied, the results of these axioms are applicable,
even though the given domain has nothing to do with randomness
in the actual world.

The existence of an axiomatized theory of probability preserves us from
the temptation “to define” probability by methods that claim to construct
a strict, purely formal mathematical theory on the basis of features of
probability that are immediately suggested by the natural sciences. Such
definitions roughly correspond to the “definition” in geometry of a point
as the result of trimming down a physical body an infinite number of
times, each time decreasing its diameter by a factor of 2.

* Nevertheless, because of the nature of its problems, the theory of probability
remains an independent mathematical discipline; its basic results (presented in detail
in §3) appear artificial and unnecessary from the point of view of pure measure theory.

With definitions of this sort, probability is taken to be the limit of the
frequency as the number of experiments increases beyond all bounds.
The very assumption that the experiments are probabilistic, i.e., that the
frequencies tend to cluster around a constant value, will remain valid (and
the same is true for the “randomness” of any particular event) only
if certain conditions are kept fixed for an unlimited time and with absolute
exactness. Thus the exact passage to the limit

$$\frac{\mu}{n} \to p \quad (n \to \infty)$$

cannot have any objective meaning. Formulation of the principle of
stability of the frequencies in such a limit process demands that we define
the allowable methods of setting up an infinite sequence of experiments,
and this can only be done by a mathematical fiction. This whole conglomeration
of concepts might deserve serious consideration if the final result
were a theory of such distinctive nature that no other means existed of
putting it on a rigorous basis. But, as was stated earlier, the mathematical
theory of probability may be based on the theory of measure, in its present-day
form, by simply adding the condition

$$P(U) = 1.$$

In general, for any practical analysis of the concept of probability, there
is no need to refer to its formal definition. It is obvious that concerning
the purely formal side of probability, we can only say the following: The
probability $P(A/S)$ is a number around which, under conditions $S$ determining
the allowable manner of setting up the experiments, the frequencies
have a tendency to be grouped, and that this tendency will occur with
greater and greater exactness as the experiments, always conducted in
such a way as to preserve the original conditions, become more numerous,
and finally that the tendency will reach a satisfactory degree of reliability
and exactness during the course of a practicable number of experiments.

In fact, the problem of importance, in practice, is not to give a formally
precise definition of randomness but to clarify as widely as possible the
conditions under which randomness of the cited type will occur. One must
clearly understand that, in reality, hypotheses concerning the probabilistic
character of any phenomenon are very rarely based on immediate
statistical verification. Only in the first stage of the penetration of
probabilistic methods into a new domain of science has the work consisted of
purely empirical observation of the constancy of frequencies. From §3,
we see that statistical verification of the constancy of frequencies with an
exactness of $\epsilon$ requires a series of experiments, each consisting of
$n = 1/\epsilon^2$ trials. For example, in order to establish that in a given concrete
problem the probability is defined with an exactness of 0.0001, it is necessary
to carry out a series of experiments containing approximately
100,000,000 trials in each.

The hypothesis of probabilistic randomness is much more often introduced
from considerations of symmetry or of successive series of events,
with subsequent verification of the hypothesis in some indirect way. For
example, since the number $n$ of molecules in a finite volume of gas is of the
order of $10^{20}$ or more, the number $\sqrt{n}$, corresponding to the probabilistic
deductions made in the kinetic theory of gases, is very large, so that many
of these deductions are verified with great exactness. Thus, the pressures
on the opposite sides of a plate suspended in still air, even if the plate is
of microscopic dimensions, turn out exactly the same, although an excess
of pressure on one side of the order of a thousandth of one per cent can
be detected in a properly arranged experiment.
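
The two orders of magnitude quoted above are easy to reproduce. The sketch
below is only an illustration in Python, with helper names invented here. It
evaluates the rule n = 1/ε² for the number of trials and the square root of
the particle count used in the gas example.

```python
import math


def trials_needed(eps):
    """Trials per series, n = 1 / eps^2, for frequencies exact to within eps."""
    return round(1.0 / eps ** 2)


def sqrt_of_count(n):
    """Integer square root of a particle count n, as in the gas example."""
    return math.isqrt(n)


n_trials = trials_needed(0.0001)   # 100000000 trials, as stated in the text
root = sqrt_of_count(10 ** 20)     # 10000000000
```

An exactness of 0.0001 therefore calls for a hundred million trials per series,
while for 10^20 molecules the number √n is already ten thousand million.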


§5. Deterministic and Random Processes 

The principle of causal relation among phenomena finds its simplest
mathematical expression in the study of physical processes by means of
differential equations as demonstrated in a series of examples in §1 of
Chapter V.

Let the state of the system under study be defined at the instant of time $t$
by $n$ parameters

$$x_1, x_2, \ldots, x_n.$$

The rates of change of these parameters are expressed by their derivatives
with respect to time

$$\dot{x}_1 = \frac{dx_1}{dt}, \quad \dot{x}_2 = \frac{dx_2}{dt}, \quad \ldots, \quad \dot{x}_n = \frac{dx_n}{dt}.$$

If it is assumed that these rates are functions of the values of the
parameters, then we get a system of differential equations

$$\begin{aligned}
\dot{x}_1 &= f_1(x_1, x_2, \ldots, x_n), \\
\dot{x}_2 &= f_2(x_1, x_2, \ldots, x_n), \\
&\;\;\vdots \\
\dot{x}_n &= f_n(x_1, x_2, \ldots, x_n).
\end{aligned}$$

The greater part of the laws of nature discovered at the time of the birth 
of mathematical physics, beginning with Galileo’s law for falling bodies, 
are expressed in just such a manner. Galileo could not express his discovery 
in this standard form, since in his time the corresponding mathematical 
concepts had not yet been developed, and this was first done by Newton. 

In mechanics and in any other fields of physics, it is customary to express
these laws by differential equations of the second order. But no new
principles are involved here; for if we denote the rates $\dot{x}_k$ by the new
symbols

$$v_k = \dot{x}_k,$$

we get for the second derivative of the quantities $x_k$ the expressions

$$\frac{d^2 x_k}{dt^2} = \dot{v}_k,$$

and the equations of the second order for the $n$ quantities $x_1, x_2, \ldots, x_n$
become equations of the first order for the $2n$ quantities $x_1, \ldots, x_n$,
$v_1, \ldots, v_n$.

As an example, let us consider the fall of a heavy body in the atmosphere
of the earth. If we consider only short distances above the surface, we may
assume that the resistance of the medium depends only on the velocity
and not on the height. The state of the system under study is characterized
by two parameters: the distance $z$ of the body from the surface of the earth,
and its velocity $v$. The change of these two quantities with time is defined
by the two differential equations

$$\dot{z} = -v, \qquad \dot{v} = g - f(v), \tag{31}$$

where $g$ is the acceleration of gravity and $f(v)$ is some “law of resistance”
for the given body.

If the velocity is not great and the body is sufficiently massive, say a
stone of moderate size falling from a height of several meters, the resistance
of the air may be neglected and equations (31) are transformed into the
equations

$$\dot{z} = -v, \qquad \dot{v} = g. \tag{32}$$

If it is assumed that at the initial instant of time $t_0$ the quantities $z$ and $v$
have values $z_0$ and $v_0$, then it is easy to solve equations (32) to obtain the
formula

$$z = z_0 - v_0 (t - t_0) - \frac{g (t - t_0)^2}{2},$$

which describes the whole process of falling. For example, if $t_0 = 0$,
$v_0 = 0$ we get

$$z = z_0 - \frac{g t^2}{2},$$

found by Galileo.

In the general case, the integration of equations (31) is more difficult,
although the basic result, with very general restrictions on the function
$f(v)$, remains the same: Given the values $z_0$ and $v_0$ at the initial instant $t_0$,
the values of $z$ and $v$ for all further instants $t$ are computed uniquely, up
to the time that the falling body hits the surface of the earth. Theoretically,
this last restriction may also be removed, if we assume that the fall is
extended to negative values of $z$. For problems set up in this manner,
the following may be established: If the function $f(v)$ is monotone
for increasing $v$ and tends to infinity for $v \to \infty$, then if the fall continues
unchecked, i.e., for unbounded growth of the variable $t$, the velocity $v$
tends to a constant limiting value $c$, which is the solution of the equation

$$g = f(c).$$

From the intuitive point of view, this result of the mathematical analysis
of the problem is quite understandable: The velocity of fall increases up
to the time that the accelerative force of gravity is balanced by the
resistance of the air. For a jump with an open parachute, the stationary velocity
$v$ of about five meters per second is attained rather quickly.* For a long
jump with unopened parachute the resistance of the air is less, so that
the stationary velocity is greater and is attained only after the parachutist
has fallen a very long way.
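
The behavior described for equations (31) can be observed numerically. The
following is only a sketch in Python under assumptions that are not in the
text: the value g = 9.81 m/s², a hypothetical quadratic law of resistance
f(v) = k v² with k chosen so that c = 5 m/s, and explicit Euler steps for the
integration. All function names are invented for the illustration.

```python
G = 9.81  # acceleration of gravity in m/s^2 (assumed value)


def simulate_fall(z0, v0, resistance, t_end, dt=1e-3):
    """Integrate dz/dt = -v, dv/dt = g - f(v) with explicit Euler steps."""
    z, v = z0, v0
    for _ in range(int(round(t_end / dt))):
        z, v = z - v * dt, v + (G - resistance(v)) * dt
    return z, v


def terminal_velocity(resistance, hi=1e3, tol=1e-10):
    """Solve g = f(c) by bisection; assumes f increasing and f(0) < g < f(hi)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if resistance(mid) < G:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def quadratic(v):
    """Hypothetical resistance law f(v) = k v^2 with k chosen so that c = 5 m/s."""
    return (G / 25.0) * v * v


def no_resistance(v):
    """Resistance neglected, as in equations (32)."""
    return 0.0


c = terminal_velocity(quadratic)                               # close to 5.0
_, v_late = simulate_fall(1000.0, 0.0, quadratic, t_end=10.0)  # close to c
z_free, _ = simulate_fall(100.0, 0.0, no_resistance, t_end=2.0)
z_formula = 100.0 - G * 2.0 ** 2 / 2.0                         # Galileo's formula
```

With these assumed values the simulated velocity settles at the root of
g = f(c) within a few seconds, and without resistance the computed height
differs from the closed formula by roughly 0.01 m, which is the first-order
error expected of the Euler method at this step size.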

For the falling of light bodies like a feather tossed into the air or a bit
of fluff, the initial period of acceleration is very short, often quite
unobservable. The stationary rate of falling is established very quickly,
and to a standard approximation we may consider that throughout the
fall $v = c$. In this case we have only one differential equation

$$\dot{z} = -c,$$

which is integrated very simply:

$$z = z_0 - c (t - t_0).$$

This is how a bit of fluff will fall in perfectly still air.

* This statement is to be taken in the sense that in practice $v$ soon gets quite close
to $c$.

This deterministic conception is treated in a completely general way in
the contemporary theory of dynamical systems, to which is dedicated a
series of important works by Soviet mathematicians, N. N. Bogoljubov,
V. V. Stepanov, and many others. This general theory also includes as
special cases the mathematical formulation of physical phenomena in
which the state of a system is not defined by a finite number of parameters
as in the earlier case, but by one or more functions, for example, in the
mechanics of continuous media. In such cases the elementary laws for
change of state in “infinitely small” intervals of time are given not by
ordinary but by partial differential equations or by some other means.
But the features common to all deterministic mathematical formulations
of actual processes are: first, that the state of the system under study is
considered to be completely defined by some mathematical entity $\omega$ (a
set of $n$ real numbers, one or more functions, and so forth); and second,
that the later values for instants of time $t > t_0$ are uniquely determined by
the value $\omega_0$ at the initial instant $t_0$:

$$\omega = F(t_0, \omega_0, t).$$

For phenomena described by differential equations the process of finding
the function $F$ consists, as we have seen, in integrating these differential
equations with the initial conditions $\omega = \omega_0$ for $t = t_0$.
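
For the falling bit of fluff the function F can be written down explicitly.
The sketch below (Python; the value c = 0.5 and the function names are
assumptions made for the illustration) also checks a consequence that the text
does not spell out: passing from t0 to t2 directly gives the same state as
passing first to an intermediate instant t1 and then on to t2.

```python
def flow(t0, z0, t, c=0.5):
    """F(t0, z0, t) for the falling fluff: z = z0 - c (t - t0)."""
    return z0 - c * (t - t0)


def composes(t0, z0, t1, t2):
    """Check F(t1, F(t0, z0, t1), t2) == F(t0, z0, t2) up to rounding."""
    direct = flow(t0, z0, t2)
    via_t1 = flow(t1, flow(t0, z0, t1), t2)
    return abs(direct - via_t1) < 1e-12


height_after_4s = flow(0.0, 10.0, 4.0)   # 8.0 for the assumed c = 0.5
```

In a deterministic scheme the state at the intermediate instant carries all
the information needed to continue the process.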

The proponents of mechanistic materialism assumed that such a
formulation is an exact and direct expression of the deterministic character
of the actual phenomena, of the physical principle of causation. According
to Laplace, the state of the world at a given instant is defined by an infinite
number of parameters, subject to an infinite number of differential
equations. If some “universal mind” could write down all these equations
and integrate them, it could then predict with complete exactness,
according to Laplace, the entire evolution of the world in the infinite
future.

But in fact this quantitative mathematical infinity is extremely coarse
in comparison with the qualitatively inexhaustible character of the real
world. Neither the introduction of an infinite number of parameters nor
the description of the state of continuous media by functions of a point
in space is adequate to represent the infinite complexity of actual events.

As was emphasized in §3 of Chapter V, the study of actual events does
not always proceed in the direction of increasing the number of parameters
introduced into the problem; in general, it is far from expedient to
complicate the $\omega$ which describes the separate “states of the system” in our
mathematical scheme. The art of the investigation consists rather in
finding a very simple space $\Omega$ (i.e., a set of values of $\omega$ or in other words,
of different possible states of the system),* such that if we replace the
actual process by varying the point $\omega$ in a determinate way over this
space, we can include all the essential aspects of the actual process.

* In the example given earlier of a falling body, the phase space is the system of
pairs of numbers $(z, v)$, i.e., a plane. For phase spaces in general, see Chapters XVII
and XVIII.

But if from an actual process we abstract its essential aspects, we are
left with a certain residue which we must consider to be random. The
neglected random factors always exercise a certain influence on the course
of the process. Very few of the phenomena that admit mathematical
investigation fail, when theory is compared with observation, to show the
influence of ignored random factors. This is more or less the state of
affairs in the theory of planetary motion under the force of gravity: The
distance between planets is so large in comparison with their size that the
idealized representation of them as material points is almost perfectly
satisfactory; the space in which they are moving is filled with such dispersed
material that its resistance to their motion is vanishingly small; the masses
of the planets are so large that the pressure of light plays almost no
role in their motions. These exceptional circumstances explain the fact
that the mathematical solution for the motion of a system of $n$ material
points, whose “states” are described by $6n$ parameters* which take into
account only the force of gravity, agrees so astonishingly well with
observation of the motion of the planets.

Somewhat similar to the case of planetary motion is the flight of an
artillery shell under gravity and resistance of the air. This is also one of
the classical regions in which mathematical methods of investigation were
comparatively easy and quickly produced great success. But here the role
of the perturbing random factors is significantly larger and the scattering
of the shells, i.e., their deviation from the theoretical trajectory, reaches
tens of meters, or for long ranges even hundreds of meters. These
deviations are caused partly by random deviations in the initial direction
and velocity, partly by random deviations in the mass and the coefficient
of resistance of the shell, and partly by gusts and other irregularities in
the wind and the other random factors governing the extraordinarily
complicated and changing conditions in the actual atmosphere of the earth.

The scattering of shells is studied in detail by the methods of the theory 
of probability, and the results of this study are essential for the practice 
of gunnery. 

But what does it mean, properly speaking, to study random events?
It would seem that, when the random “residue” for a given formulation
of a phenomenon proves to be so large that it cannot be neglected, then
the only possible way to proceed is to describe the phenomenon more
accurately by introducing new parameters and to make a more detailed
study by the same method as before.

* The three coordinates and the three components of the velocity of each point.

But in many cases such a procedure is not realizable in practice. For
example, in studying the fall of a material body in the atmosphere, with
account taken of an irregular and gusty (or, as one usually says, turbulent)
wind flow, we would be required to introduce, in place of the two
parameters $z$ and $v$, an altogether unwieldy mathematical apparatus to describe
this structure completely.

But in fact this complicated procedure is necessary only in those cases
where for some reason we must determine the influence of these residual
“random” factors in all detail and separately for each individual factor.
Fortunately, our practical requirements are usually quite different; we
need only estimate the total effect exerted by the random factors for a
long interval of time or for a large number of repetitions of the process
under study.

As an example, let us consider the shifting of sand in the bed of a river,
or in a hydroelectric construction. Usually this shifting occurs in such a
way that the greater part of the sand remains undisturbed, while only
now and then a particularly strong turbulence near the bottom picks up
individual grains and carries them to a considerable distance, where they
are suddenly deposited in a new position. The purely theoretical motion
of each grain may be computed individually by the laws of hydrodynamics,
but for this it would be necessary to determine the initial state of the
bottom and of the flow in every detail and to compute the flow step by
step, noting those instants when the pressure on any particular grain of
sand becomes sufficient to set it in motion, and tracing this motion until
it suddenly comes to an end. The absurdity of setting up such a problem
for actual scientific study is obvious. Nevertheless the average laws or, as
they are usually called, the statistical laws of shifting of sand over river
bottoms are completely amenable to investigation.

Examples of this sort, where the effect of a large number of random
factors leads to a completely clear-cut statistical law, could easily be
multiplied. One of the best known and at the same time most fascinating
of these, in view of the breadth of its applications, is the kinetic theory of
gases, which shows how the joint influence of random collisions of
molecules gives rise to exact laws governing the pressure of a gas on the
wall, the diffusion of one gas through another, and so forth.


§6. Random Processes of Markov Type 

To A. A. Markov is due the construction of a probabilistic scheme which
is an immediate generalization of the deterministic scheme of §5 described
by the equation

$$\omega = F(t_0, \omega_0, t).$$

It is true that Markov considered only the case where the phase space of
the system consists of a finite number of states $\Omega = (\omega_1, \omega_2, \ldots, \omega_n)$ and
studied the change of state of the system only for changes of time $t$ in
discrete steps. But in this extremely schematic model he succeeded in
establishing a series of fundamental laws.

Instead of a function $F$, uniquely defining the state $\omega$ at time $t > t_0$
corresponding to the state $\omega_0$ at time $t_0$, Markov introduced the
probabilities

$$P(t_0, \omega_i; t, \omega_j)$$

of obtaining the state $\omega_j$ at time $t$ under the condition that at time $t_0$ we
had the state $\omega_i$. These probabilities are connected for any three instants
of time

$$t_0 < t_1 < t_2$$

by a relation, introduced by Markov, which may be called the basic
equation for a Markov process

$$P(t_0, \omega_i; t_2, \omega_j) = \sum_{k=1}^{n} P(t_0, \omega_i; t_1, \omega_k)\, P(t_1, \omega_k; t_2, \omega_j). \tag{33}$$
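
Equation (33) can be verified exactly on a small example. The sketch below
(Python) assumes a hypothetical system with two states whose one-step
transition probabilities do not depend on time, so that the transition
probabilities over several steps are powers of a single matrix. The numbers
and the function names are invented for the illustration.

```python
from fractions import Fraction as Fr


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical one-step transition probabilities for a two-state system.
# Entry [i][j] is the probability of passing from state i to state j in one step.
STEP = [[Fr(9, 10), Fr(1, 10)],
        [Fr(1, 2), Fr(1, 2)]]


def transition(t0, t):
    """Matrix of P(t0, w_i; t, w_j) for integer times: the (t - t0)-th power of STEP."""
    n = len(STEP)
    result = [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(t - t0):
        result = mat_mul(result, STEP)
    return result


# Equation (33) with t0 = 0, t1 = 2, t2 = 5: the sum over the intermediate
# state w_k is exactly a matrix product.
lhs = transition(0, 5)
rhs = mat_mul(transition(0, 2), transition(2, 5))
```

Here `lhs` and `rhs` coincide entry by entry, and every row of `lhs` sums to
one, as it must for probabilities of passing to some state.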

When the phase space is a continuous manifold, the most typical case
is that a probability density $p(t_0, \omega_0; t, \omega)$ exists for passing from the
state $\omega_0$ to the state $\omega$ in the interval of time $(t_0, t)$. In this case the
probability of passing from the state $\omega_0$ to any of the states $\omega$ belonging
to a domain $G$ in the phase space $\Omega$ is written in the form

$$P(t_0, \omega_0; t, G) = \int_G p(t_0, \omega_0; t, \omega)\, d\omega, \tag{34}$$

where $d\omega$ is an element of volume in the phase space.* For the probability
density $p(t_0, \omega_0; t, \omega)$, the basic equation (33) takes the form

$$p(t_0, \omega_0; t_2, \omega_2) = \int_\Omega p(t_0, \omega_0; t_1, \omega)\, p(t_1, \omega; t_2, \omega_2)\, d\omega. \tag{35}$$

* Properly speaking, equation (34) serves to define the probability density. The
quantity $p\, d\omega$ is equal (up to an infinitesimal of higher order) to the probability of
passing in the time from $t_0$ to $t$ from the state $\omega_0$ to the element of volume $d\omega$.

Equation (35) is usually difficult to solve, but under known restrictions
we may deduce from it certain partial differential equations that are easy
to investigate. Some of these equations were derived from nonrigorous
physical considerations by the physicists Fokker and Planck. In its
complete form this theory of so-called stochastic differential equations
was constructed by Soviet authors, S. N. Bernšteĭn, A. N. Kolmogorov,
I. G. Petrovskiĭ, A. Ja. Hinčin, and others.

We will not give these equations here.

The method of stochastic differential equations allows us, for example,
to solve without difficulty the problem of the motion in still air of a very
small body, for which the mean velocity $c$ of its fall is significantly less
than the velocity of the “Brownian motion” arising from the fact that, because
of the smallness of the particle, its collisions with the molecules of the
air are not in perfect balance on its various sides.

Let $c$ be the mean velocity of fall, and $D$ be the so-called coefficient of
diffusion. If we assume that a particle does not remain on the surface of
the earth ($z = 0$) but is “reflected,” i.e., under the influence of the
Brownian forces it is again sent up into the atmosphere, and if we also
assume that at the instant $t_0$ the particle is at height $z_0$, then the probability
density $p(t_0, z_0; t, z)$ of its being at height $z$ at the instant $t$ is expressed
by the formula


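$$p(t_0, z_0; t, z) = \frac{1}{2\sqrt{\pi D (t - t_0)}} \left[ e^{-\frac{(z - z_0)^2}{4D(t - t_0)}} + e^{-\frac{(z + z_0)^2}{4D(t - t_0)}} \right] e^{-\frac{c(z - z_0)}{2D} - \frac{c^2 (t - t_0)}{4D}} + \frac{c}{D\sqrt{\pi}}\, e^{-cz/D} \int_{\frac{z + z_0 - c(t - t_0)}{2\sqrt{D(t - t_0)}}}^{\infty} e^{-u^2}\, du.$$

In figure 4 we illustrate how the curves $p(t_0, z_0; t, z)$ may change for a
sequence of instants $t$.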

We see that in the mean the height of the particle increases, and its
position is more and more indefinite, more “random.” The most interesting
aspect of the situation is that for any $t_0$ and $z_0$ and for $t \to \infty$

$$p(t_0, z_0; t, z) \to \frac{c}{D}\, e^{-cz/D}; \tag{36}$$

i.e., there exists a limit distribution for the height of the particle, and the
mathematical expectation for this height with increasing $t$ tends to a
positive limit

$$z^* = \frac{c}{D} \int_0^\infty z\, e^{-cz/D}\, dz = \frac{D}{c}. \tag{37}$$

So in spite of the fact that as long as our particle is above the surface of
the earth, it will always tend to fall because of the force of gravity,
nevertheless, as this process (wandering in the atmosphere) continues,
the particle will be found on the average at a definite positive height. If we
take the initial $z_0$ smaller than $z^*$, it will turn out that in a sufficiently great
interval of time the mean position of the particle will be higher than its
initial position, as is shown in figure 5, where $z_0 = 0$.
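
The limit distribution (36) and the mean height (37) can be checked by direct
numerical integration. The sketch below is only an illustration in Python: the
values c = 2 and D = 0.5, the cut-off of the infinite integral, and the use of
the trapezoidal rule are all assumptions made here, not part of the text.

```python
import math


def limit_density(z, c, D):
    """Limit density (36): (c / D) * exp(-c z / D) for z >= 0."""
    return (c / D) * math.exp(-c * z / D)


def integrate(func, a, b, n=200_000):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for k in range(1, n):
        total += func(a + k * h)
    return total * h


c, D = 2.0, 0.5          # assumed illustrative values
upper = 60.0 * D / c     # beyond this height the density is negligible
mass = integrate(lambda z: limit_density(z, c, D), 0.0, upper)
mean_height = integrate(lambda z: z * limit_density(z, c, D), 0.0, upper)
```

For these values the total probability comes out equal to one and the mean
height equal to D/c = 0.25, both to within the accuracy of the quadrature.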




For individual particles the mean values z* under discussion here are 
only mathematical expectations, but from the law of large numbers it 
follows that for a large number of particles they will actually be realized: 
The density of the distribution in height of such particles will follow from 
the indicated laws, and, in particular, after a sufficient interval of time 
this density will become stable in accordance with formula (36). 

What has been said so far is immediately applicable only to gases,
smoke, and the like, which occur in the air in small concentrations, since the
quantities $c$ and $D$ were assumed to be defined by a preassigned state of
the atmosphere. However, with certain complications, the theory is
applicable to the mutual diffusion of the gases that compose the
atmosphere, and to the distribution in height of their densities arising from this
mutual diffusion.

The quotient $c/D$ increases with the size of the particles, so that the
character of the motion changes from diffusion to regular fall in accordance
with the laws considered in §5. The theory allows us to trace all transitions
between purely diffusive motion and such laws of fall.

The problem of motion of particles suspended in a turbulent atmosphere
is more difficult, but in principle it may be handled by similar probabilistic
methods.


CHAPTER XII

APPROXIMATIONS OF FUNCTIONS

§1. Introduction 

In practical life we are constantly faced with the problem of approximating
certain numbers by means of others. For example, our measurements
of various concrete magnitudes, length, area, temperature, and so
forth, lead us to numbers that are only approximations. In practice we
use only rational numbers, i.e., numbers of the form $p/q$, where $p$ and
$q$ ($q \ne 0$) are integers. But, in addition to the rational numbers, the
irrational numbers also exist, and although we do not use them in
measuring, still our theoretical arguments often lead to them. We know,
for example, that the length of the circumference of a circle of radius
$r = 1/2$ is equal to the irrational number $\pi$, and the length of the hypotenuse
of a right triangle with unit sides is equal to $\sqrt{2}$. In actual computations
with irrational numbers, one first of all approximates them by rational
numbers with a required degree of exactness, usually by means of a
terminating decimal fraction.

The same situation also occurs for functions. The quantitative laws of 
nature are expressed in mathematics by means of functions, not with 
absolute exactness, but approximately, with various degrees of precision. 
Further, in a vast number of cases we find it necessary, even for functions 
defined by completely mathematical rules, to approximate them by 
other functions with specified exactness so as to be able to compute 
them in practice. 

However, these remarks do not refer to computations only. The problem
of defining a function by means of other functions has great theoretical
importance. Let us illustrate in a few words. The development of
mathematical analysis has led to the discovery and study of very important
classes of approximating functions that under known conditions have
proved to be the natural means of approximating other more or less
arbitrary functions. These classes turned out to be, above all, the algebraic
and trigonometric polynomials, and also their various generalizations.
It was shown that from the properties of the function to be approximated
we may estimate, under certain conditions, the character of its deviation
from a specific sequence of functions approximating it. Conversely, if
we know how it deviates from its approximation by a sequence of
functions, we can establish certain properties of the function. In this
direction a theory of functions has been constructed that is based on
their approximate representation by various classes of approximating
functions. There is a similar theory in the theory of numbers. In it the
properties of irrational numbers are studied on the basis of their
approximations by rational numbers.

In Chapter II the reader has already met one very important method
of approximation, namely Taylor’s formula. With its help a function
satisfying certain conditions is approximated by another function of the
form $P(x) = a_0 + a_1 x + \cdots + a_n x^n$, which is called an algebraic
polynomial. Here the $a_k$ are constants, independent of $x$.

An algebraic polynomial has a very simple structure; in order to
compute it for given coefficients $a_k$ and given values of $x$ we need to
apply only the three arithmetic operations, addition, subtraction, and
multiplication. The simplicity of this computation is extremely important
in practice and is one of the reasons why algebraic polynomials are the
most widespread means of approximating functions (another important
reason is discussed later). It is sufficient to point out that especially at
the present time technical computations must be carried out on computing
machines on a massive scale. In their present state of perfection computing
machines work very rapidly and tirelessly. However, machines can
perform only relatively simple operations. They may be set to perform
arithmetic operations on very large numbers, but never, for example,
the infinite process of passage to the limit. A machine cannot compute
$\log x$ exactly, but we can approximate $\log x$ by a polynomial $P(x)$ with
any required degree of accuracy, and then compute the polynomial by a
machine.
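
As a small illustration of this remark, the following sketch evaluates a polynomial using nothing but additions and multiplications. It uses Horner's scheme, a standard technique that the text itself does not name; the function name `horner` and the sample coefficients are assumptions chosen for the example. The second part approximates log(1 + t) by the first six terms of its Taylor expansion.

```python
import math


def horner(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x**n using only + and *.

    coeffs is the list [a_0, a_1, ..., a_n].
    """
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# P(x) = 1 + 2x + 3x^2 evaluated at x = 2
value = horner([1.0, 2.0, 3.0], 2.0)

# Illustrative coefficients: Taylor terms of log(1 + t) = t - t^2/2 + t^3/3 - ...
log_coeffs = [0.0] + [(-1.0) ** (k + 1) / k for k in range(1, 7)]
approx = horner(log_coeffs, 0.1)           # polynomial approximation of log(1.1)
error = abs(approx - math.log(1.1))        # compared with the library logarithm
```

Taking more terms of the expansion reduces `error` further, which is the sense in which the accuracy can be made as high as required.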

In addition to Taylor’s formula, there are others of great practical
importance in the approximation of functions by algebraic polynomials.
Among them are the various interpolation formulas, which are widely
used, in particular, in approximate computation of integrals, and also
in approximate integration of differential equations. Well known also is
the method of approximation in the sense of the mean square, which is
very widely used with other functions as well as algebraic polynomials.
For certain practical questions great importance is attached to the method
of best uniform (or Čebyšev) approximation, originated by the great
Russian mathematician Čebyšev, a method which arose, as we will see,
from the solution of a problem connected with the construction of
mechanisms.

Our present purpose is to give the reader some idea of these methods
and, as far as possible, to state the conditions under which one method
is preferable to another. No one of them is absolutely the best. Every
method can be seen to be better than the others under certain conditions.
For example, if we have a physical problem to solve, then some one
method of approximating the functions that occur in the problem is
particularly indicated by the character of the problem itself or, as one
says, by physical considerations. Also we will see later that under
well-known conditions one method of approximation may be applicable, and
another not.

Each of the methods of computation arose in its own time and has 
its own characteristic theory and history. Newton was already familiar 
with a formula for interpolation and gave it a very convenient form for 
practical computation with what are called difference quotients. The
method of approximation in the sense of the mean square is at least
150 years old. But for a long time these methods did not give rise to a
connected theory. They were only various practical methods of approximating
functions, and furthermore, the restrictions on their applicability
were not clear.

The present theory of approximations to functions arose from the work
of Čebyšev, who introduced the important concept of best approximation,
in particular best uniform approximation, made systematic use of it in
practical applications and developed its theoretical basis. Best approximation
is the fundamental concept in the contemporary theory of
approximation. After Čebyšev, his ideas were developed further by his
students E. I. Zolotarev, A. N. Korkin, and the brothers A. A. and
V. A. Markov. In the Čebyšev period of the theory of approximation
of functions, not only were the fundamental concepts introduced, but
basic methods were found for obtaining the best approximations to
arbitrary individual functions, methods which are in wide use at the
present time; also, there were basic investigations of the properties of
the approximating classes, particularly of algebraic and trigonometric
polynomials, from the point of view of the requirements arising from
practical problems.

The further development of the theory of approximation of functions
was influenced by an important mathematical discovery, made at the
end of the last century by the German mathematician Weierstrass. With
complete rigor he proved the theoretical possibility of approximating
an arbitrary continuous function by an algebraic polynomial with any
given degree of accuracy. This is the second reason why algebraic polynomials
are a universal means of approximating functions. The mere
simplicity of construction of algebraic polynomials is not sufficient; we
also require the possibility of approximating any continuous function by
a polynomial with arbitrary prescribed error. This possibility was proved
by Weierstrass.

The profound ideas of Čebyšev on best approximation and the theorem
of Weierstrass served as a basis, at the beginning of the present century,
of the present-day development in the theory of approximation. In this
connection let us mention the names of S. N. Bernšteĭn, Borel, Jackson,
Lebesgue, and de la Vallée-Poussin. Briefly, this development may be
described as follows. Up to the time of Čebyšev (the beginning of the
present century), the problems usually consisted of approximation of
individual functions, but the characteristic problem of the present-day
period is the approximation, by polynomials or otherwise, of entire
classes of functions, analytic, differentiable, and the like.

The Russian school, and now the Soviet school, of the theory of
approximation has played a leading role in this theory. Important
contributions have been made by S. N. Bernšteĭn, A. N. Kolmogorov,
M. A. Lavrent’ev, and their students. At the present time the theory
has developed into an essentially distinct branch of the theory of functions.

In addition to algebraic polynomials, another very important means of
approximation consists of the trigonometric polynomials. A trigonometric
polynomial of order $n$ is a function of the form

$$u_n(x) = \alpha_0 + \alpha_1 \cos x + \beta_1 \sin x + \alpha_2 \cos 2x + \beta_2 \sin 2x + \cdots + \alpha_n \cos nx + \beta_n \sin nx,$$

or more concisely

$$u_n(x) = \alpha_0 + \sum_{k=1}^{n} (\alpha_k \cos kx + \beta_k \sin kx),$$

where $\alpha_k$ and $\beta_k$ are constants.

There are various particular methods of approximation by trigonometric
polynomials, which are usually connected in a rather simple way with
the corresponding methods of approximation by algebraic polynomials.
Among these methods an especially important role is played by the
expansion of functions in a Fourier series (see §7). These series are known
by the name of the French mathematician Fourier, who at the beginning
of the last century made several theoretical discoveries concerning them,
in his study of the conduction of heat. However, it should be noted that
trigonometric series were investigated as early as the middle of the
18th century by the great mathematicians Leonhard Euler and Daniel
Bernoulli. In Euler’s work they were related to his researches in astronomy,
and in Bernoulli’s to his study of the oscillating string. We may
remark that Euler and Bernoulli raised the fundamentally significant
question of the possibility of representing a more or less arbitrary function
by a trigonometric series, a question which was finally answered only in
the middle of the last century. Its affirmative answer, discussed later,
was anticipated by Bernoulli.

Fourier series are of great importance in physics, but we will give
little attention to this aspect of them, since it has been considered in
Chapter VI. In that chapter also the reader will find examples of physical
problems that naturally lead to the expansion of a given function in
series other than the trigonometric series but with great similarity to them.
We refer to the so-called series of orthogonal functions.

Fourier series have had a history of two hundred years. So it is not
surprising that by now their theory is extraordinarily broad, subtle, and
profound and constitutes an independent discipline in mathematics. An
especially remarkable role in this theory has been played by the Moscow
school of the theory of functions of a real variable, N. N. Luzin, A. N.
Kolmogorov, D. E. Men’šov, and others.

We note also that the significance of trigonometric polynomials in
contemporary mathematics is hardly exhausted by their role as methods
of approximation. For example, in Chapter X the reader became acquainted
with the fundamental results of I. M. Vinogradov in the theory
of numbers, which were derived on the basis of a suitably devised apparatus
of trigonometric sums (polynomials).


§2. Interpolation Polynomials 

A special case of the construction of interpolating polynomials. In
practical computations the interpolation method of approximating a
function is widely used. To introduce the reader to a range of questions
of this type, we consider the following elementary problem.

Let the function $y = f(x)$ be given on the interval $[x_0, x_2]$, with graph
as illustrated in figure 1. The appearance of this graph is reminiscent of
an arc of a parabola. So if we wish to approximate our function by a
simple function, it is natural to choose a polynomial of the second degree

$$P(x) = a_0 + a_1 x + a_2 x^2, \tag{1}$$

the graph of which is a parabola.

The interpolation method consists of the following. In the interval
$[x_0, x_2]$ we choose an interior point $x_1$. The points $x_0, x_1, x_2$ give
corresponding values of our function

$$y_0 = f(x_0), \quad y_1 = f(x_1), \quad y_2 = f(x_2).$$

We construct a polynomial (1) such that at the points $x_0, x_1, x_2$ it
agrees with the function in question (its graph is shown by the dashed
curve in figure 1). In other words, we must choose the coefficients
$a_0, a_1, a_2$ in the polynomial (1) so that they satisfy the equations

$$P(x_0) = y_0, \quad P(x_1) = y_1, \quad P(x_2) = y_2. \tag{2}$$

We note that our function $f(x)$ may be defined otherwise than by a
formula; for example, its values may be given empirically as shown by
the graph in figure 1. To solve the interpolation
problem, we choose an approximating function in the form of an analytic
expression, namely the polynomial $P(x)$. If the exactness of the approximation
is satisfactory, the polynomial so chosen has the advantage over
the original function that we can compute its intermediate values.

This interpolation problem could be solved as follows: We could set up
the three equations

$$y_0 = a_0 + a_1 x_0 + a_2 x_0^2,$$
$$y_1 = a_0 + a_1 x_1 + a_2 x_1^2,$$
$$y_2 = a_0 + a_1 x_2 + a_2 x_2^2,$$

solve them for $a_0, a_1, a_2$ and substitute the values of these coefficients
in equation (1). But let us solve it in a somewhat different way. We begin
by constructing the polynomial $Q_0(x)$ of the second degree such that it
satisfies the three conditions: $Q_0(x_0) = 1$, $Q_0(x_1) = 0$, $Q_0(x_2) = 0$. From
the last two conditions it follows that this polynomial must have the form
$A(x - x_1)(x - x_2)$, and from the first condition that

$$A = \frac{1}{(x_0 - x_1)(x_0 - x_2)}.$$

Thus

$$Q_0(x) = \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)}.$$

Similarly the polynomials

$$Q_1(x) = \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)}, \qquad Q_2(x) = \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}$$

satisfy the conditions

$$Q_1(x_0) = Q_1(x_2) = 0, \quad Q_1(x_1) = 1,$$
$$Q_2(x_0) = Q_2(x_1) = 0, \quad Q_2(x_2) = 1.$$

Further, it is obvious that the polynomial $y_0 Q_0(x)$ has the value $y_0$ for
$x = x_0$ and vanishes for $x = x_1$ and $x = x_2$, and corresponding properties
hold for the polynomials $y_1 Q_1(x)$ and $y_2 Q_2(x)$.

Hence it readily follows that the desired interpolating polynomial is
given by the formula

$$P(x) = y_0 Q_0(x) + y_1 Q_1(x) + y_2 Q_2(x) = y_0 \frac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + y_1 \frac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + y_2 \frac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)}. \tag{3}$$

We note that the polynomial so obtained is the unique polynomial of
the second degree which solves our interpolation problem. For if we
assume that some other polynomial $P_1(x)$ of the second degree is also
a solution of the problem, then the difference $P_1(x) - P(x)$, which is also
a polynomial of the second degree, vanishes at the three points $x = x_0$,
$x_1$, $x_2$. But we know from algebra that if a polynomial of the second
degree vanishes for three values of $x$, then it is identically zero. So the
polynomials $P(x)$ and $P_1(x)$ agree identically.

It is clear that in general the polynomial so obtained agrees with the
given function only at the points $x_0, x_1, x_2$ and differs from it for other
values of $x$.
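
The two routes just described, solving the three linear equations for the coefficients and writing the answer directly as a sum of the polynomials $Q_k$, can be compared numerically. The sketch below is only an illustration: the function names and the sample data (values of 1 + x + x² at 0, 1, 3) are assumptions, and the linear system is solved by successive difference quotients, which is one convenient choice and not a method prescribed by the text.

```python
def lagrange3(xs, ys, x):
    """Second-degree interpolating polynomial written as y0*Q0 + y1*Q1 + y2*Q2."""
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    q0 = (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
    q1 = (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
    q2 = (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
    return y0 * q0 + y1 * q1 + y2 * q2


def coefficients3(xs, ys):
    """Solve the three linear equations for a0, a1, a2 by difference quotients."""
    x0, x1, x2 = xs
    y0, y1, y2 = ys
    d01 = (y1 - y0) / (x1 - x0)
    d12 = (y2 - y1) / (x2 - x1)
    a2 = (d12 - d01) / (x2 - x0)
    a1 = d01 - a2 * (x0 + x1)
    a0 = y0 - a1 * x0 - a2 * x0 * x0
    return a0, a1, a2


xs = (0.0, 1.0, 3.0)
ys = (1.0, 3.0, 13.0)                 # values of 1 + x + x^2 at the three points
a0, a1, a2 = coefficients3(xs, ys)
via_coefficients = a0 + a1 * 2.0 + a2 * 2.0 ** 2
via_lagrange = lagrange3(xs, ys, 2.0)
```

Both evaluations give the same value at x = 2, as the uniqueness argument requires.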

If we take $x_1$ at the center of the interval $[x_0, x_2]$ and put
$x_2 - x_1 = x_1 - x_0 = h$, then formula (3) is somewhat simplified:

$$P(x) = \frac{1}{2h^2}\left[y_0 (x - x_1)(x - x_2) - 2y_1 (x - x_0)(x - x_2) + y_2 (x - x_0)(x - x_1)\right].$$

As an example let us interpolate the sine curve $y = \sin x$ (figure 2)
by a polynomial of degree two, agreeing with it at the points $x = 0, \pi/2, \pi$.
Obviously, the desired polynomial has the form

$$P(x) = \frac{4}{\pi^2}\, x(\pi - x) \approx \sin x.$$

Let us compare $\sin x$ and $P(x)$ at two intermediate points:

$$P\left(\frac{\pi}{4}\right) = 0.75, \quad \text{while} \quad \sin\frac{\pi}{4} = \frac{\sqrt{2}}{2} \approx 0.71,$$

$$P\left(\frac{\pi}{6}\right) = \frac{10}{18}, \quad \text{while} \quad \sin\frac{\pi}{6} = \frac{9}{18}.$$

In this way we have approximated $\sin x$ on the interval $[0, \pi]$ with an
accuracy* of about 0.06. On the other hand, the expansion of $\sin x$ in a
Taylor series around the point $\pi/2$ gives

$$\sin x = \cos\left(x - \frac{\pi}{2}\right) = 1 - \frac{(x - \pi/2)^2}{2!} + \frac{(x - \pi/2)^4}{4!} - \cdots.$$

If we stop at the second term of the expansion, we have at the point
$x = 0$ the approximation $\sin 0 \approx 1 - \pi^2/8 \approx -0.234$, i.e., an error greater
than 0.2.

We see that our interpolation method has produced an approximation
to $\sin x$ on the whole interval $[0, \pi]$ by a polynomial of degree two that
is more satisfactory than the Taylor expansion of second degree. However,
we must not forget that Taylor’s formula gives a very exact approximation
close to the point $x = \pi/2$ around which it is taken, more exact in this
neighborhood than the approximation obtained by interpolating.

* However, for a complete justification of this statement, we need to prove that the
difference $(4x/\pi^2)(\pi - x) - \sin x$ does not exceed in value 0.06, not only for $x = \pi/4$
and $x = \pi/6$, but also for all $x$ on the interval $[0, \pi]$; we will not do this.
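
The comparison between the interpolating parabola and the Taylor polynomial of degree two can be checked numerically. The following sketch samples both on a grid over [0, π]; the grid of 1001 points and the function names are assumptions made for the illustration, and a grid maximum only approximates the true maximum.

```python
import math


def p_interp(x):
    """Interpolating parabola through (0, 0), (pi/2, 1), (pi, 0)."""
    return 4.0 * x * (math.pi - x) / math.pi ** 2


def p_taylor(x):
    """Taylor polynomial of degree two for sin x about x = pi/2."""
    return 1.0 - (x - math.pi / 2) ** 2 / 2.0


grid = [math.pi * i / 1000 for i in range(1001)]
max_err_interp = max(abs(p_interp(x) - math.sin(x)) for x in grid)
max_err_taylor = max(abs(p_taylor(x) - math.sin(x)) for x in grid)

# Close to pi/2 the Taylor polynomial is the more accurate of the two.
x_near = math.pi / 2 + 0.1
near_interp = abs(p_interp(x_near) - math.sin(x_near))
near_taylor = abs(p_taylor(x_near) - math.sin(x_near))
```

On this grid the largest deviation of the parabola is roughly 0.056, while the Taylor polynomial is off by more than 0.23 at the ends of the interval; near π/2 the ordering is reversed.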

The general solution of the problem. It is clear that a more complicated
function $y = f(x)$, as illustrated in figure 3, is hardly suitable
for approximation by a polynomial of degree two, since no parabola
of degree two could follow all the bends of the curve $y = f(x)$. In this
case it is natural to try an interpolation of the function with a polynomial
of higher degree (not less than the fourth).

The general problem of interpolation consists of constructing a polynomial
$P(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ of degree $n$ which agrees
with a given function at $n + 1$ points $x_0, x_1, \ldots, x_n$, that is, which
satisfies the $n + 1$ equations:

$$P(x_0) = f(x_0), \quad P(x_1) = f(x_1), \quad \ldots, \quad P(x_n) = f(x_n).$$

The points at which it is required that the function agree with its approximating
polynomial are called the points of interpolation.

Reasoning in the same way as for a second-degree polynomial, we can
easily prove that the desired polynomial may be written in the form

$$P_n(x) = \sum_{k=0}^{n} \frac{(x - x_0)(x - x_1) \cdots (x - x_{k-1})(x - x_{k+1}) \cdots (x - x_n)}{(x_k - x_0)(x_k - x_1) \cdots (x_k - x_{k-1})(x_k - x_{k+1}) \cdots (x_k - x_n)}\, f(x_k) \tag{4}$$

and further that this polynomial (of degree $n$) is unique. The formula
so written is known as Lagrange’s formula. It may also be put in various
other forms; for example, it is widely used in practice in the form involving
Newton’s difference quotients.
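
Lagrange's formula translates almost literally into a short program. The sketch below is a direct, unoptimized evaluation for an arbitrary list of distinct points; the name `lagrange` and the sample cubic are assumptions for the example.

```python
def lagrange(xs, ys, x):
    """Evaluate at x the polynomial of degree len(xs) - 1 through (xs[k], ys[k])."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        term = yk
        for j, xj in enumerate(xs):
            if j != k:
                # factor (x - x_j) / (x_k - x_j) of the k-th summand
                term *= (x - xj) / (xk - xj)
        total += term
    return total


xs = [0.0, 1.0, 2.0, 3.0]
ys = [xk ** 3 - 2.0 * xk for xk in xs]    # samples of the cubic x^3 - 2x
mid_value = lagrange(xs, ys, 1.5)
```

Because the interpolating polynomial of degree three through four points is unique, `mid_value` coincides (up to rounding) with the value of the cubic itself at 1.5.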

The deviation of the interpolation polynomial from the generating function.
The method of interpolation is a universal means of approximating
functions. In principle, the function is not required to have any particular
properties for interpolation to be possible; for example, it is not required
to have derivatives over the whole interval of approximation. In this
respect the method of interpolation has an advantage over Taylor’s
formula. It is interesting to note that there are cases when the function
is even analytic at every point on an interval but cannot be approximated
by its Taylor’s formula over the interval. Suppose, for example, that we
require a good approximation of the function $1/(1 + x^2)$ on the interval
$[-2, 2]$ by means of an algebraic polynomial. At first glance it is natural
to try its expansion in a Taylor series about the point $x = 0$

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots.$$

But it is easy to see that this series is convergent only in the interval
$-1 < x < 1$. Outside the interval $[-1, 1]$, it diverges and consequently
cannot approximate $1/(1 + x^2)$ on the whole interval $[-2, 2]$. Nevertheless,
the interpolation method is completely applicable here.
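
A short numerical experiment makes the contrast visible. The partial sums of the Taylor series are evaluated at x = 2, and an interpolating polynomial of degree 20 is evaluated over the whole interval. The choice of interpolation points (zeros of a Čebyšev polynomial scaled to [−2, 2]), the degree, and all function names are assumptions made for this sketch; the text at this point does not say which points to use.

```python
import math


def f(x):
    return 1.0 / (1.0 + x * x)


def taylor_partial(x, terms):
    """Partial sum 1 - x^2 + x^4 - ... with the given number of terms."""
    return sum((-x * x) ** k for k in range(terms))


def lagrange(xs, ys, x):
    """Direct evaluation of the interpolating polynomial in Lagrange form."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        term = yk
        for j, xj in enumerate(xs):
            if j != k:
                term *= (x - xj) / (xk - xj)
        total += term
    return total


n = 20
# Assumed choice of nodes: Chebyshev zeros scaled from [-1, 1] to [-2, 2].
nodes = [2.0 * math.cos((2 * k + 1) * math.pi / (2 * (n + 1))) for k in range(n + 1)]
values = [f(t) for t in nodes]

grid = [-2.0 + 4.0 * i / 400 for i in range(401)]
interp_err = max(abs(lagrange(nodes, values, x) - f(x)) for x in grid)
taylor_err = abs(taylor_partial(2.0, 11) - f(2.0))
```

The Taylor partial sums at x = 2 grow without bound as more terms are taken, whereas the interpolation error on the sampled grid is small.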

Of course, the question arises in each case of choosing the number
and distribution of the points of interpolation in such a way that the
error will satisfy certain requirements. For functions with derivatives of
sufficiently high order, the answer to this question of the possible
magnitude of error is given by the following classical result, which we
introduce without proof.

If on the interval $[x_0, x_n]$ the function $f(x)$ has a continuous derivative
of order $n + 1$, then for any intermediate value of $x$ the deviation of
$f(x)$ from the Lagrange interpolation polynomial $P(x)$ with points of
interpolation $x_0 < x_1 < \cdots < x_n$ is given by the formula

$$f(x) - P(x) = \frac{(x - x_0)(x - x_1) \cdots (x - x_n)}{(n + 1)!}\, f^{(n+1)}(c),$$

where $c$ is an intermediate point between $x_0$ and $x_n$. This formula is
reminiscent of the corresponding formula for the remainder term in the
Taylor expansion and is essentially a generalization of it. So, if it is known
that the derivative $f^{(n+1)}(x)$ of order $n + 1$ on the interval $[x_0, x_n]$ nowhere
exceeds the number $M$ in absolute value, then the error of the approximation
for any value of $x$ on this interval is bounded by the following
estimate:

$$|f(x) - P_n(x)| \le \frac{|x - x_0|\,|x - x_1| \cdots |x - x_n|}{(n + 1)!}\, M.$$
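
As a check of this estimate on a concrete case, take sin x on [0, π] with the three points of interpolation 0, π/2, π, so that n = 2 and the third derivative, −cos x, is bounded by M = 1. The sketch below compares the actual deviation with the bound on a grid; the grid and the names used are assumptions made for the example.

```python
import math

nodes = [0.0, math.pi / 2, math.pi]          # n = 2, three points of interpolation


def p_interp(x):
    """Second-degree interpolating polynomial of sin x at the nodes above."""
    return 4.0 * x * (math.pi - x) / math.pi ** 2


def bound(x, M=1.0):
    """|x - x_0| |x - x_1| |x - x_2| * M / 3! for the three nodes above."""
    product = 1.0
    for t in nodes:
        product *= abs(x - t)
    return product * M / math.factorial(len(nodes))


grid = [math.pi * i / 200 for i in range(201)]
violations = [x for x in grid
              if abs(math.sin(x) - p_interp(x)) > bound(x) + 1e-12]
```

The list `violations` stays empty: at every sampled point the deviation lies below the bound (a tolerance of 1e-12 absorbs rounding at the nodes, where both sides vanish).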


The contemporary theory of approximation provides many other
methods of estimating the error in interpolation. This question has been
carefully studied and some interesting, completely unexpected facts have
been discovered.

Consider, for example, a smooth function $y = f(x)$, defined on the
interval $[-1, 1]$, i.e., one whose graph is a continuous curve with a
continuously varying tangent. Our choice of the interval with specific
end points $-1$ and $1$ is unimportant; the facts described here remain
valid for an arbitrary interval $[a, b]$ with inconsequential changes.

We assume now that on the interval $[-1, 1]$ we have chosen a system
of $n + 1$ points

$$-1 \le x_0 < x_1 < \cdots < x_n \le 1 \tag{5}$$

and have then constructed the polynomial $P_n(x) = a_0 + a_1 x + \cdots + a_n x^n$
of degree $n$ that agrees with $f(x)$ at these points. We will assume temporarily
that the points of the system (5) are equally spaced along the
interval. If $n$ increases indefinitely, then the corresponding interpolating
polynomial $P_n(x)$ will agree with $f(x)$ at a greater and greater number
of points, and we might think that at an intermediate point $x$, not belonging
to the system (5), the difference $f(x) - P_n(x)$ would converge to zero as
$n \to \infty$. This opinion was held even at the end of the last century, but
it was afterwards discovered that the facts are far otherwise. It has been
shown that for many smooth (even analytic) functions $f(x)$, in the case
of evenly spaced points of division $x_k$, the interpolating polynomials
$P_n(x)$ do not at all converge to $f(x)$ as $n \to \infty$. The graph of the interpolating
polynomial certainly agrees with $f(x)$ at the given points of
interpolation, but in spite of this it deviates strongly for large $n$ from
the graph of $f(x)$ at intermediate values of $x$ and the deviation increases
with increasing $n$. As further investigation showed, this situation may be
avoided, at least for smooth functions, if the points of interpolation are
distributed more sparsely near the center of the interval and more
densely near the ends. Indeed, it has been shown that in a well-known
sense the best distribution of the points of interpolation is the one in
which the points $x_k$ occur at the zeros* of the Čebyšev polynomials
$\cos[(n + 1) \arccos x]$ defined by the formula

$$x_k = \cos\frac{(2k + 1)\pi}{2(n + 1)} \qquad (k = 0, 1, \ldots, n).$$

The interpolation polynomials which correspond to these points of
interpolation (called Čebyšev interpolation polynomials) have the property
that they are uniformly convergent to the function which generated them,
provided the latter is smooth, i.e., is itself continuous and has a
continuous first derivative. The graph of such a function is a continuous
curve with a continuously varying tangent. Figure 4 shows the distribution
of the zeros of the Čebyšev polynomial for the case $n = 5$.


* A zero of the function $f(x)$ is a value $x_k$ for which $f(x_k) = 0$. For details on
Čebyšev polynomials see §5.

As for arbitrary nonsmooth continuous functions, the situation is
worse; it can be shown that in general there is no sequence of points of
interpolation such that the interpolating process will converge for any
continuous function (Faber’s theorem). In other words, however we may
divide the interval $[-1, 1]$ into parts, with the number of points of
interpolation approaching infinity, we can always find a function $f(x)$,
continuous in the interval, such that the
successive polynomials with these points of interpolation will not converge
to the function. Even for the mathematicians of the middle of the last
century, this fact, had it been known, would have sounded paradoxical.
Of course, the explanation is that among the continuous nonsmooth
functions there are some extraordinarily “bad” ones, for example those
which do not have a derivative at any point of the interval on which
they are defined, and these supply examples for which a given interpolation
process will not converge. Effective methods of approximation to these
functions by polynomials can be suggested by making some changes in
the previous interpolation process, but we will not take the time to do
this here.

In conclusion we note that algebraic polynomials are not the only
means available for interpolation. There are methods for interpolation by
trigonometric polynomials, for example, which are well developed from
the practical and also from the theoretical point of view.



§3. Approximation of Definite Integrals

Interpolation of functions has wide application in questions related to
the approximate computation of integrals. As an example, we introduce
an approximate formula for a definite integral, namely Simpson’s rule,
which is widely used in applied analysis.

Let it be required to compute an approximation to the definite integral
on the interval $[a, b]$ of the function $f(x)$, whose graph is illustrated in
figure 5. The exact value is given by the area of the curvilinear trapezoid
$aABb$. Let $C$ be the point of the graph with abscissa $c = (a + b)/2$.
Through the points $A$, $B$, and $C$, we pass a parabola of degree two. As
we know from the preceding section, this parabola is the graph of a
polynomial of the second degree, defined by the equation

$$P(x) = \frac{1}{2h^2}\left[(x - c)(x - b)\,y_0 - 2(x - a)(x - b)\,y_1 + (x - a)(x - c)\,y_2\right],$$

where

$$h = \frac{b - a}{2}, \quad y_0 = f(a), \quad y_1 = f(c), \quad y_2 = f(b).$$

Fig. 5.

In the terminology of the preceding section, we may say that the
second-degree polynomial $P(x)$ interpolates $f(x)$ at the points with
abscissas $a$, $c$, $b$. If the graph of the function $f(x)$ on the interval $[a, b]$
does not change too violently and the interval is not large, then the
polynomial $P(x)$ will everywhere differ little from $f(x)$; this, in turn,
implies that their integrals taken over $[a, b]$ will also differ little from
each other. On this basis we may assume these integrals are approximately
equal,

$$\int_a^b f(x)\,dx \approx \int_a^b P(x)\,dx,$$

or, as it is customarily stated, the second integral is an approximation
to the first. Simple computations, which we leave to the reader, show that

$$\int_a^b (x - c)(x - b)\,dx = \frac{2}{3}h^3, \qquad -\int_a^b (x - a)(x - b)\,dx = \frac{4}{3}h^3, \qquad \int_a^b (x - a)(x - c)\,dx = \frac{2}{3}h^3.$$

Hence

$$\int_a^b P(x)\,dx = \frac{h}{3}\,[f(a) + 4f(c) + f(b)].$$

Thus the definite integral may be computed by the following approximation
formula:

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\,[f(a) + 4f(c) + f(b)].$$

This is Simpson’s formula.

As an example, let us use this formula to compute the integral of $\sin x$
on the interval $[0, \pi]$. In this case

$$h = \frac{\pi}{2}, \quad f(a) = \sin 0 = 0, \quad f(c) = \sin\frac{\pi}{2} = 1, \quad f(b) = \sin\pi = 0,$$

and consequently $(h/3)[f(a) + 4f(c) + f(b)] = 2\pi/3 = 2.09\ldots$. On the other
hand the integral can be found exactly:

$$\int_0^{\pi} \sin x\,dx = -\cos x\,\Big|_0^{\pi} = 2.$$

The error does not exceed 0.1.

If the interval $[0, \pi]$ is decomposed into two equal parts and on each
of these our formula is applied separately, then we get

$$\int_0^{\pi/2} \sin x\,dx \approx \frac{\pi}{12}\left[\sin 0 + 4\sin\frac{\pi}{4} + \sin\frac{\pi}{2}\right] = \frac{\pi}{12}\,(1 + 2\sqrt{2}) \approx 1.0023,$$

$$\int_{\pi/2}^{\pi} \sin x\,dx \approx 1.0023.$$

In this manner

$$\int_0^{\pi} \sin x\,dx \approx 2.0046;$$

and now the error is considerably less, about 0.005.
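
The two computations for sin x are easy to reproduce. In the sketch below the function name `simpson_basic` is an assumption; it applies Simpson's formula once to a given interval, first to the whole of [0, π] and then to each half.

```python
import math


def simpson_basic(f, a, b):
    """Simpson's formula (h/3)[f(a) + 4 f(c) + f(b)] with h = (b - a)/2."""
    h = (b - a) / 2.0
    c = (a + b) / 2.0
    return h / 3.0 * (f(a) + 4.0 * f(c) + f(b))


whole = simpson_basic(math.sin, 0.0, math.pi)
halves = (simpson_basic(math.sin, 0.0, math.pi / 2)
          + simpson_basic(math.sin, math.pi / 2, math.pi))
```

Here `whole` equals 2π/3 up to rounding, and `halves` comes out near 2.0046, so halving the interval reduces the error from roughly 0.09 to roughly 0.005.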

In practice, in order to compute approximately the definite integral
of a function $f(x)$ on $[a, b]$ we divide the interval into an even number $n$
of equal parts by the points $a = x_0 < x_1 < \cdots < x_n = b$ and successively
apply Simpson’s rule to the segment $[x_0, x_2]$, and then to the segment
$[x_2, x_4]$ and so forth. As a result we have the following general formula of
Simpson:

$$\int_a^b f(x)\,dx \approx \frac{b - a}{3n}\,[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + f(x_n)]. \tag{6}$$

Let us now give without proof the classical estimate for the error. If on
the interval $[a, b]$ the function $f(x)$ has a fourth derivative which satisfies
the inequality $|f^{\mathrm{IV}}(x)| \le M$, then the following estimate holds

$$\left|\int_a^b f(x)\,dx - L(f)\right| \le \frac{M(b - a)^5}{180\,n^4}. \tag{7}$$

Here by $L(f)$ we denote the right side of formula (6). In this case the
error will be of order $n^{-4}$.*
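
The general formula of Simpson and the accompanying estimate can be tried out on sin x over [0, π], for which the fourth derivative is bounded by M = 1. The sketch below is an illustration under these assumptions; the function names `simpson` and `simpson_bound` are not from the text.

```python
import math


def simpson(f, a, b, n):
    """General formula of Simpson with an even number n of equal parts."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # interior points carry the weights 4, 2, 4, 2, ..., 4
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return (b - a) / (3.0 * n) * total


def simpson_bound(M, a, b, n):
    """Classical error estimate M (b - a)^5 / (180 n^4)."""
    return M * (b - a) ** 5 / (180.0 * n ** 4)


errors = {n: abs(simpson(math.sin, 0.0, math.pi, n) - 2.0) for n in (2, 4, 8, 16)}
ratio = errors[8] / errors[16]
```

Each error stays below the corresponding estimate, and doubling n divides the error by roughly 16, in agreement with the order of the error.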


* If a certain quantity $\alpha_n$, depending on $n = 1, 2, \ldots$, satisfies the inequality
$|\alpha_n| < C/n^k$, where $C$ is a constant independent of $n$, then we say that it is of order $n^{-k}$.

We could have decomposed the interval $[a, b]$ into $n$ equal parts and
taken as our approximation to the integral the sum of the areas of the
rectangles drawn in figure 6. Then we would get an approximation formula
from the rectangles*

$$\int_a^b f(x)\,dx \approx \frac{b - a}{n}\,[f(x_0) + f(x_1) + \cdots + f(x_{n-1})]. \tag{8}$$

Fig. 6.

It may be shown that the order of error here is $n^{-2}$, provided the function
has a second derivative that is bounded on the interval $[a, b]$. We may
also take as an approximation the sum of the areas of the trapezoids
drawn in figure 7 and get the trapezoidal formula

$$\int_a^b f(x)\,dx \approx \frac{b - a}{2n}\,[f(x_0) + 2f(x_1) + \cdots + 2f(x_{n-1}) + f(x_n)] \tag{9}$$

with order of error $n^{-2}$, provided the function has a bounded second
derivative.
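
The order of error of the rectangular and trapezoidal formulas can be observed in the same way as for Simpson's formula. The sketch below, with assumed function names, applies both to sin x on [0, π] and compares the errors for n = 10 and n = 20.

```python
import math


def rectangles(f, a, b, n):
    """Rectangular formula: f is taken at the centers of the n equal parts."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def trapezoids(f, a, b, n):
    """Trapezoidal formula: f is taken at the points of division."""
    h = (b - a) / n
    inner = sum(f(a + k * h) for k in range(1, n))
    return h / 2.0 * (f(a) + 2.0 * inner + f(b))


def error(rule, n):
    return abs(rule(math.sin, 0.0, math.pi, n) - 2.0)


rect_ratio = error(rectangles, 10) / error(rectangles, 20)
trap_ratio = error(trapezoids, 10) / error(trapezoids, 20)
```

Both ratios are close to 4, that is, doubling n divides the error by about 2², as an error of the second order in 1/n requires.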

* In this case $x_0, x_1, \ldots, x_{n-1}$ are the centers of the equal parts of the interval $[a, b]$,
and not points of division as in formulas (6) and (9).

It is usually said that Simpson’s formula is more exact than the
trapezoidal and rectangular formulas. This statement requires amplification,
without which it will not be true. If we know only that a function
has a first derivative, then the guaranteed order of approximation for
each of the three methods is alike equal to $n^{-1}$; in this case Simpson’s
formula has no essential advantage over the rectangular and trapezoidal
formulas. For functions that have a second derivative, it is guaranteed
that the approximations by the trapezoidal formula and by Simpson’s
formula are each of order $n^{-2}$. But if the function has a third and fourth
derivative, then the order of the error is still equal to $n^{-2}$ for the rectangular
and trapezoidal formulas, but for Simpson’s formula it is equal to
$n^{-3}$ and $n^{-4}$ respectively. But the order $n^{-4}$ for Simpson’s formula proves
in its turn to be the best possible result; in other words, for functions
that have derivatives of higher order than the fourth, the order of error
remains equal to $n^{-4}$. Thus, if we are given a function that has a derivative
of fifth order and wish to make use of this fact to obtain an approximation
of order $n^{-5}$, we need a new method of approximation to the definite
integral, different from Simpson’s formula. To explain how it must be
constructed, we note the following.

The trapezoidal and rectangular formulas, as is easily shown, are
exact for polynomials of the first degree; this means that substitution in (9)
of the function $A + Bx$, where $A$ and $B$ are constants, leads to exact
equality. In the same way Simpson’s formula proves to be exact for
polynomials of the third degree $A + Bx + Cx^2 + Dx^3$. The gist of the
matter lies in this fact. Let us suppose that we have divided the interval
$[a, b]$ into $n$ equal parts and on each part have used a method of approximation,
the same on each part, which is exact for polynomials
$A + Bx + \cdots + Fx^{m-1}$ of degree $m - 1$. Then the error of the approximation
for every function which has a bounded $m$th derivative will be
of order $n^{-m}$, and if this function is not a polynomial of degree $m - 1$,
then this order cannot be increased even for functions which have
derivatives of much higher order.
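
The degree up to which each formula is exact can be found by trying the powers 1, x, x², ... in turn. The following sketch does this on the interval [0, 1]; the helper `degree_of_exactness`, the tolerance, and the rule names are assumptions made for the illustration.

```python
def degree_of_exactness(rule, max_degree=10, tol=1e-12):
    """Largest d such that rule integrates 1, x, ..., x^d exactly on [0, 1]."""
    d = -1
    for j in range(max_degree + 1):
        exact = 1.0 / (j + 1)                      # integral of x^j over [0, 1]
        if abs(rule(lambda x, j=j: x ** j, 0.0, 1.0) - exact) > tol:
            break
        d = j
    return d


def trapezoid_rule(f, a, b):
    return (b - a) / 2.0 * (f(a) + f(b))


def midpoint_rule(f, a, b):
    return (b - a) * f((a + b) / 2.0)


def simpson_rule(f, a, b):
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


degrees = {
    "trapezoid": degree_of_exactness(trapezoid_rule),
    "midpoint": degree_of_exactness(midpoint_rule),
    "simpson": degree_of_exactness(simpson_rule),
}
```

The result is degree 1 for the trapezoidal and rectangular (midpoint) formulas and degree 3 for Simpson's formula, which corresponds to m = 2 and m = 4 in the statement above.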

Our present remarks emphasize the importance of finding the simplest
possible approximate methods of integration that are exact for polynomials
of a given degree. This question, on which the present-day
literature is quite large, has interested mathematicians for a long time.
Here we can only refer to certain classical results.

Let the function $p(x)$ be given. We are asked how to distribute on the
interval $[-1, 1]$ the points of division $x_1, \ldots, x_m$ and how to choose the
number $K$, so as to satisfy the equation

$$\int_{-1}^{1} f(x)\,p(x)\,dx = K \sum_{i=1}^{m} f(x_i)$$

for every polynomial $f(x)$ of degree $m$.

It can be shown that for $p(x) = (1 - x^2)^{-1/2}$ the problem is solved if
$K = \pi/m$, and the $x_i$ are the zeros of the Čebyšev polynomial $\cos(m \arccos x)$
(cf. §5).
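
For the weight $(1 - x^2)^{-1/2}$ the statement can be verified on small cases. The sketch below takes m = 3 and uses the known values of the integrals of 1, x², and x³ against this weight (π, π/2, and 0 respectively); the function name is an assumption.

```python
import math


def chebyshev_quadrature(f, m):
    """(pi/m) times the sum of f over the zeros of cos(m arccos x)."""
    nodes = [math.cos((2 * i - 1) * math.pi / (2 * m)) for i in range(1, m + 1)]
    return math.pi / m * sum(f(x) for x in nodes)


approx_one = chebyshev_quadrature(lambda x: 1.0, 3)     # exact value: pi
approx_x2 = chebyshev_quadrature(lambda x: x * x, 3)    # exact value: pi/2
approx_x3 = chebyshev_quadrature(lambda x: x ** 3, 3)   # exact value: 0
```

With three points all three results agree with the exact integrals up to rounding.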

For $p(x) = 1$, Čebyšev gave a solution of the problem for $m = 1, 2, \ldots, 7$.
For $m = 8$ the problem has no solution: the points of division may be
found but they are complex. For $m = 9$ it again has a solution. However,
as S. N. Bernšteĭn showed, for any $m > 9$ the problem has no solution
(the points of division lie outside the interval $[-1, +1]$).

A quadrature formula that is exact for polynomials of degree $n$ can be
constructed very simply by means of Lagrange’s formula (4). If we
integrate its left and right sides on the interval $[a, b]$, we obtain

$$\int_a^b P_n(x)\,dx = \sum_{k=0}^{n} p_k f(x_k), \tag{10}$$

where

$$p_k = \int_a^b \frac{(x - x_0) \cdots (x - x_{k-1})(x - x_{k+1}) \cdots (x - x_n)}{(x_k - x_0) \cdots (x_k - x_{k-1})(x_k - x_{k+1}) \cdots (x_k - x_n)}\,dx \qquad (k = 0, 1, \ldots, n).$$

If $f(x)$ is itself a polynomial of degree $n$, then by uniqueness $P_n(x)$
coincides with $f(x)$. Consequently, equation (10) is valid for all polynomials
of degree $n$, and thus the quadrature formula

$$\int_a^b f(x)\,dx \approx \sum_{k=0}^{n} p_k f(x_k)$$

is exact for all polynomials of degree $n$.

When

$$x_0 = a, \quad x_1 = \frac{a + b}{2}, \quad x_2 = b,$$

this formula reduces, as we have seen earlier, to Simpson’s formula.
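
The coefficients $p_k$ can be computed exactly by multiplying out each Lagrange factor and integrating term by term. The sketch below does this with rational arithmetic; the helper names `poly_mul`, `poly_integral`, and `weights` are assumptions, and polynomials are stored as coefficient lists with the lowest degree first.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out


def poly_integral(p, a, b):
    """Integrate the polynomial p exactly from a to b."""
    return sum(c * (b ** (i + 1) - a ** (i + 1)) / (i + 1) for i, c in enumerate(p))


def weights(nodes, a, b):
    """p_k = integral over [a, b] of the k-th Lagrange basis polynomial."""
    result = []
    for k, xk in enumerate(nodes):
        basis = [Fraction(1)]
        for j, xj in enumerate(nodes):
            if j != k:
                # multiply by the factor (x - x_j) / (x_k - x_j)
                basis = poly_mul(basis, [-xj / (xk - xj), 1 / (xk - xj)])
        result.append(poly_integral(basis, a, b))
    return result


# Points a, (a + b)/2, b on [0, 2], so that h = 1.
simpson_weights = weights([Fraction(0), Fraction(1), Fraction(2)],
                          Fraction(0), Fraction(2))
```

For these three points the coefficients come out as 1/3, 4/3, 1/3, that is h/3, 4h/3, h/3 with h = 1, which are the coefficients of Simpson's formula.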

The distribution of the points of interpolation x k (k = 0, 1, •••, n) in 
the interval [a, b] may be changed. For every distribution of the points 
there will be a corresponding quadrature formula. 

Gauss, the famous German mathematician ofthe last century, showed 
that the interpolation points x k may be distributed in such a manner 
that the formula will be exact for ail polynomials not only of degree n, 
but also of degree 2n + 1. 

The polynomial

$$A_{n+1}(x) = (x - x_0)(x - x_1) \cdots (x - x_n)$$

of degree $n + 1$, arising from Gauss’s points of division $x_k$, has a
remarkable property: For any polynomial $P(x)$ of degree less than $n + 1$,
we have the equation

$$\int_a^b A_{n+1}(x)\,P(x)\,dx = 0.$$

In other words, the polynomial $A_{n+1}(x)$ is orthogonal on the interval
$[a, b]$ to all polynomials of degree not greater than $n$. The polynomials
$A_{n+1}(x)$ are called the Legendre polynomials (corresponding to the interval
$[a, b]$).
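
Both properties of Gauss’s points can be observed numerically. The sketch below is an illustration under stated assumptions: it takes the interval $[-1, 1]$, uses NumPy's `leggauss` routine to supply the points and weights, and picks $n = 2$; the helper names are ours.

```python
import numpy as np

n = 2                                             # n + 1 = 3 Gauss points
nodes, weights = np.polynomial.legendre.leggauss(n + 1)

def gauss(f):
    """Quadrature sum over the Gauss points on [-1, 1]."""
    return float(np.sum(weights * f(nodes)))

# Exact for degree 2n + 1 = 5 (hence for x^4), but not for degree 6.
error_deg4 = abs(gauss(lambda x: x ** 4) - 2.0 / 5.0)
error_deg6 = abs(gauss(lambda x: x ** 6) - 2.0 / 7.0)

# A_{n+1}(x) = (x - x_0)...(x - x_n) is orthogonal to 1, x, ..., x^n.
A = np.poly(nodes)                                # coefficients, highest power first

def integral_with_power(j):
    """Integral over [-1, 1] of A_{n+1}(x) * x^j, computed exactly on coefficients."""
    product = np.polymul(A, [1.0] + [0.0] * j)
    antiderivative = np.polyint(product)
    return float(np.polyval(antiderivative, 1.0) - np.polyval(antiderivative, -1.0))

orthogonality = [integral_with_power(j) for j in range(n + 1)]
```

The entries of `orthogonality` vanish up to rounding, while `error_deg6` is clearly nonzero, showing that degree $2n + 1$ is where exactness stops.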


§4. The Čebyšev Concept of Best Uniform Approximation

Statement of the question. Čebyšev came to the idea of best uniform
approximation from a purely practical problem, since he was not only
one of the greatest mathematicians of the last century, creating the basis
for a number of mathematical disciplines that are widely developed at
present, but was also a leading engineer of his time. In particular,
Čebyšev was very much interested in questions of the construction of
mechanisms producing a given trajectory of motion. We will now explain
this idea.

Let the curve $y = f(x)$ be given on the interval $a \le x \le b$. We wish
to construct, subject to specific technical requirements, a mechanism such
that a certain one of its points will describe this curve as exactly as
possible when the mechanism is in operation. Čebyšev solved the problem
as follows. First of all, looking for the solution as an engineer, he
constructed the required mechanism in such a manner as to get a rough
approximation to the required trajectory. Thus, a certain point $A$ of the
mechanism, admittedly not yet in its final form, would describe the curve

$$y = \varphi(x), \tag{11}$$

resembling the required curve $y = f(x)$ only in its general features. The
mechanism so constructed consists of separate parts, gears, levers of
various kinds, and the like. All of these have specific measurements

$$\alpha_0, \alpha_1, \ldots, \alpha_m, \tag{12}$$

which completely describe the mechanism, and consequently the curve (11).
They are the parameters of the mechanism and of the curve (11).* Thus
the curve (11) depends not only on the argument $x$, but also on the
parameters (12). To any assigned system of values of the parameters will
correspond a specific curve, whose equation may be conveniently written
in the form

$$y = \varphi(x; \alpha_0, \alpha_1, \ldots, \alpha_m). \tag{13}$$

It is customary to say in such cases that we have obtained a family of
functions (13), defined on the interval $a \le x \le b$ and depending on the
$m + 1$ parameters (12).

* Details of the calculations for mechanisms of this sort may be found in the
publication “The Scientific Heritage of P. L. Čebyšev,” Volume II, Academy of Sciences
of the USSR, 1945.

For the further solution of his problem Čebyšev worked as a pure
mathematician. He proposed, in a perfectly natural way, to take as the
measure of the deviation of the function $f(x)$ from the approximating
function $\varphi(x; \alpha_0, \alpha_1, \ldots, \alpha_m)$ the magnitude

$$\|f - \varphi\| = \max_{a \le x \le b} |f(x) - \varphi(x; \alpha_0, \alpha_1, \ldots, \alpha_m)|, \tag{14}$$

equal to the maximum of the absolute value of the difference
$f(x) - \varphi(x; \alpha_0, \alpha_1, \ldots, \alpha_m)$ on the interval $a \le x \le b$ (figure 8).
This quantity is obviously a certain function

$$\|f - \varphi\| = F(\alpha_0, \alpha_1, \ldots, \alpha_m) \tag{15}$$

of the parameters $\alpha_0, \alpha_1, \ldots, \alpha_m$. The problem is now to find those
values of the parameters for which the function (15) is a minimum.
These values define a function $\varphi$, which it is customary to describe as
the best uniform approximation of the given function $y = f(x)$ among all
possible functions of the given family (11). The magnitude
$F(\alpha_0, \alpha_1, \ldots, \alpha_m)$ for these values of the parameters is called
the best uniform approximation of the function $f(x)$ on the interval $[a, b]$
by means of the functions of the family (13). It is usually denoted by the
symbol $E_n(f)$. The term “uniform” is often replaced, especially in non-Soviet
literature, by the term “Čebyšev.” They both emphasize the specific
character of the approximation, since other types of approximation are
of course possible; for example, one may speak of the best approximation
to $f(x)$ by functions from a given family in the sense of the mean square.
This subject will be discussed in §8.

Fig. 8.

Čebyšev first discovered the various laws which hold for the type of
approximation we are discussing here and found that in many cases the
function which is the best uniform approximation to $f(x)$ on the interval
$[a, b]$ has the remarkable property that for it the maximum (15) of the
absolute value of the difference

$$f(x) - \varphi(x; \alpha_0, \alpha_1, \ldots, \alpha_m)$$

is attained for at least $m + 2$ points of the interval $[a, b]$ with successively
alternating signs (figure 9).

We have no space here for an exact formulation of the conditions
under which this proposition is valid and refer our better prepared readers
to the article of V. L. Gončarov “The theory of the best approximation
of functions” (“The Scientific Heritage of Čebyšev,” Volume I).
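
The deviation (14), regarded as a function (15) of the parameters, is easy to tabulate on a grid. The sketch below is a hypothetical illustration, not taken from the text: it uses $f(x) = x^2$ on $[0, 1]$ and the two-parameter family $\alpha_0 + \alpha_1 x$, and compares the line $x - 1/8$ (whose deviation alternates in sign at three points) with the mean-square line $x - 1/6$.

```python
import numpy as np

def uniform_deviation(f, phi, params, a, b, samples=10001):
    """Approximate F(params) = max |f(x) - phi(x; params)| on a grid over [a, b]."""
    x = np.linspace(a, b, samples)
    return float(np.max(np.abs(f(x) - phi(x, *params))))

f = lambda x: x ** 2
line = lambda x, a0, a1: a0 + a1 * x           # family with m + 1 = 2 parameters

best_uniform = uniform_deviation(f, line, (-1.0 / 8.0, 1.0), 0.0, 1.0)
least_squares = uniform_deviation(f, line, (-1.0 / 6.0, 1.0), 0.0, 1.0)

# Signed deviations of the first line at x = 0, 1/2, 1 alternate in sign.
signed = [f(x) - line(x, -1.0 / 8.0, 1.0) for x in (0.0, 0.5, 1.0)]
```

The first line has deviation $1/8$, attained with alternating signs at $m + 2 = 3$ points; the mean-square line has the larger uniform deviation $1/6$, which illustrates that the two notions of best approximation differ.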

The case of approximation of functions by polynomials. The cited
investigations of Čebyšev are especially important for the general theory
of approximation when applied to the question of approximating an arbitrary
function $f(x)$ on a given interval $[a, b]$ by polynomials
$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ of given degree $n$. The
polynomials $P_n(x)$ of degree $n$ constitute a family of functions depending on
the $n + 1$ coefficients as parameters. As may be shown,
the theory of Čebyšev is fully applicable to polynomials, so that if we
wish to make the best uniform approximation to the function $f(x)$ on
the segment $[a, b]$ by a polynomial $P_n(x)$ chosen from all possible
polynomials of the given degree $n$, then we need only find all those values
of $x$ on this interval for which the function $|f(x) - P_n(x)|$ assumes its
maximum $L$ on $[a, b]$. If among them we can find $n + 2$ values
$x_1, x_2, \ldots, x_{n+2}$, such that the difference $f(x) - P_n(x)$ successively
changes sign

$$f(x_1) - P_n(x_1) = \pm L,$$
$$f(x_2) - P_n(x_2) = \mp L,$$
$$\cdots$$
$$f(x_{n+2}) - P_n(x_{n+2}) = \pm (-1)^{n+1} L,$$

then $P_n(x)$ is the best polynomial, and otherwise not. For example, the
solution of the problem of best uniform approximation by polynomials
$P_1(x) = p + qx$ of the first degree to the function $f(x)$
illustrated in figure 10 consists of the polynomial $p_0 + q_0 x$ whose
graph is a straight line parallel to the chord $AB$ and dividing into
equal parts the parallelogram enclosed between the chord and
the tangent $CD$ to the curve $y = f(x)$ which is parallel to that chord,
since the absolute value of the difference $f(x) - (p_0 + q_0 x)$ obviously
assumes its maximum for the values $x_0 = a$, $x_1$, and $x_2 = b$, where $x_1$
is the abscissa of the point of tangency $F$, and for these values the
difference itself successively changes sign. To avoid misunderstanding, we
note that we are speaking of a curve that is convex downward and has a
tangent at every point. In this example $E_1(f)$ is equal to half the length
of any one of the (equal) segments $AC$, $BD$, or $GF$.
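
The chord and tangent construction can be carried out directly for a concrete convex function. The sketch below assumes $f(x) = e^x$ on $[0, 1]$ as an example of our own choosing; the helper `best_line_convex` and its argument `fprime_inverse` (which returns the point where $f'$ takes a given value) are hypothetical names.

```python
import math

def best_line_convex(f, fprime_inverse, a, b):
    """Best uniform line p0 + q0*x for a smooth convex f on [a, b],
    built from the chord AB and the parallel tangent."""
    q0 = (f(b) - f(a)) / (b - a)               # slope of the chord AB
    x1 = fprime_inverse(q0)                    # abscissa of the point of tangency
    p0 = 0.5 * (f(a) + f(x1) - q0 * (a + x1))  # line midway between chord and tangent
    L = 0.5 * (f(a) - f(x1) - q0 * (a - x1))   # half the vertical gap between them
    return p0, q0, x1, L

# For f(x) = exp(x) the derivative is exp, so its inverse is log.
p0, q0, x1, L = best_line_convex(math.exp, math.log, 0.0, 1.0)
deviations = [math.exp(x) - (p0 + q0 * x) for x in (0.0, x1, 1.0)]
```

The three deviations equal $+L$, $-L$, $+L$, so the criterion with $n + 2 = 3$ alternating points is satisfied and $L$ plays the role of $E_1(f)$.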

§5. The Čebyšev Polynomials Deviating Least from Zero

Let us consider the following problem. It is required to find a polynomial
$P_{n-1}(x)$ of degree $n - 1$ which is the best uniform approximation on the
interval $[-1, 1]$ to the function $x^n$.

It turns out that the desired polynomial satisfies the equation

$$x^n - P_{n-1}(x) = \frac{1}{2^{n-1}} \cos (n \arccos x). \tag{16}$$

This fact follows directly from Čebyšev’s theorem, if we prove, first that
the right side of (16) is an algebraic polynomial of degree $n$ with the
coefficient of $x^n$ equal to one; second, that its absolute value on the
interval $[-1, +1]$ assumes its maximum, equal to $L = 1/2^{n-1}$, at the
$n + 1$ points $x_k = \cos (k\pi/n)$ $(k = 0, 1, \ldots, n)$; and third, that it changes
sign successively at these points.

The fact that the right side of (16) is a polynomial of degree $n$ with
coefficient of $x^n$ equal to one may be proved as follows.

Let us assume that for a given natural number $n$ we have already
proved that

$$\cos (n \arccos x) = 2^{n-1} [x^n - Q_{n-1}(x)],$$

$$-\sqrt{1 - x^2}\, \sin (n \arccos x) = 2^{n-1} [x^{n+1} - Q_n(x)],$$

where $Q_{n-1}$ and $Q_n$ are algebraic polynomials of degree $n - 1$ and $n$,
respectively. Then similar equations will also be valid for $n + 1$, as is
easily established by consideration of the following formulas:

$$\cos ((n + 1) \arccos x) = x \cos (n \arccos x) - \sqrt{1 - x^2}\, \sin (n \arccos x);$$

$$-\sqrt{1 - x^2}\, \sin ((n + 1) \arccos x) = -x \sqrt{1 - x^2}\, \sin (n \arccos x) + (x^2 - 1) \cos (n \arccos x).$$

But our equations for $n = 1$ are true, since

$$\cos (\arccos x) = x,$$

$$-\sqrt{1 - x^2}\, \sin (\arccos x) = x^2 - 1.$$

Consequently, they are true for any $n$.

The right side of (16) is called the Čebyšev polynomial of degree $n$
deviating least from zero, since Čebyšev was the first to state and solve
this problem. The first few of these polynomials are

$$T_0(x) = 1,$$

$$T_1(x) = x,$$

$$T_2(x) = \tfrac{1}{2}(2x^2 - 1),$$

$$T_3(x) = \tfrac{1}{4}(4x^3 - 3x),$$

$$T_4(x) = \tfrac{1}{8}(8x^4 - 8x^2 + 1),$$

$$T_5(x) = \tfrac{1}{16}(16x^5 - 20x^3 + 5x).$$
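
The list above can be regenerated mechanically. The sketch below does not follow the induction of the text; it assumes instead the standard three-term identity $\cos((k+1)\theta) = 2\cos\theta\,\cos k\theta - \cos((k-1)\theta)$, and the function names are ours. It also checks the values at the points $x_k = \cos(k\pi/n)$.

```python
import math
from fractions import Fraction

def chebyshev_coeffs(n):
    """Integer coefficients (lowest degree first) of cos(n arccos x)."""
    prev, cur = [1], [0, 1]                    # cos(0*t) = 1, cos(t) = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        # cos((k+1)t) = 2x cos(kt) - cos((k-1)t)
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur

def monic_chebyshev(n):
    """Coefficients of 2^(1-n) cos(n arccos x) for n >= 1 (leading coefficient one)."""
    return [Fraction(c, 2 ** (n - 1)) for c in chebyshev_coeffs(n)]

def evaluate(coeffs, x):
    return sum(float(c) * x ** i for i, c in enumerate(coeffs))

n = 5
coeffs = monic_chebyshev(n)                    # (16x^5 - 20x^3 + 5x) / 16
extrema = [evaluate(coeffs, math.cos(k * math.pi / n)) for k in range(n + 1)]
```

For $n = 5$ the values in `extrema` are $\pm 1/16$ with alternating signs, in agreement with $L = 1/2^{n-1}$.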

We have already seen the important role of the Čebyšev polynomials
in questions of interpolation and of approximate methods of integration.
Let us make some further remarks on interpolation.

From the fact that the difference $f(x) - P_n(x)$ between an arbitrary
function $f(x)$ and its best approximating polynomial $P_n(x)$ changes sign
at $n + 2$ points, it follows from the properties of continuous functions
that $P_n(x)$ agrees with $f(x)$ at $n + 1$ specific points of the interval $[a, b]$;
i.e., $P_n(x)$ is an interpolating polynomial of degree $n$ for $f(x)$ with a certain
choice of points of interpolation.

In this way the problem of the best uniform approximation of a
continuous function $f(x)$ becomes one of choosing, on the interval $[-1, 1]$,
a system $x_0, x_1, \ldots, x_n$ of points of interpolation such that the
corresponding interpolating polynomial of degree $n$ will have a deviation
$\|f - Q\| = \max_x |f(x) - Q(x)|$ of least possible value. Unfortunately, the
required points of division are often difficult to find in practice. Usually
it is necessary to solve the problem in some approximate way, and here
the Čebyšev polynomials play a special role. It turns out that if, in
particular, the points of interpolation are taken to be zeros of the polynomial
$\cos ((n + 1) \arccos x)$ (i.e., the points where this polynomial is equal to
zero), then the corresponding interpolating polynomial, at least for large $n$,
will give a uniform deviation from the function (if it is sufficiently smooth)
which differs little from the corresponding deviation of the best uniform
approximation to the function by a polynomial. The somewhat vague
expression “differs little” can be replaced, in a number of important
characteristic cases, by very exact quantitative estimates, which we will
not establish here.
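
The effect of the choice of interpolation points can be seen in a small experiment. The sketch below is only an illustration: the test function $1/(1 + 25x^2)$, the degree $n = 16$, and all helper names are our own assumptions, and the interpolating polynomial is evaluated directly from Lagrange’s formula.

```python
import numpy as np

def lagrange_eval(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at the points x."""
    x = np.asarray(x, dtype=float)
    result = np.zeros_like(x)
    for k, xk in enumerate(nodes):
        others = np.delete(nodes, k)
        basis = np.prod((x[:, None] - others) / (xk - others), axis=1)
        result += values[k] * basis
    return result

def chebyshev_zeros(n):
    """Zeros of cos((n + 1) arccos x): n + 1 points inside (-1, 1)."""
    j = np.arange(n + 1)
    return np.cos((2 * j + 1) * np.pi / (2 * (n + 1)))

def max_error(f, nodes, grid):
    return float(np.max(np.abs(f(grid) - lagrange_eval(nodes, f(nodes), grid))))

f = lambda x: 1.0 / (1.0 + 25.0 * x ** 2)
grid = np.linspace(-1.0, 1.0, 2001)
n = 16
err_chebyshev = max_error(f, chebyshev_zeros(n), grid)
err_equispaced = max_error(f, np.linspace(-1.0, 1.0, n + 1), grid)
```

For this function the deviation at the Čebyšev zeros is small, while equally spaced points of the same number give a deviation larger than the function itself.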

Returning to the Čebyšev polynomial, let us consider it in the form
$T_n(x) = M \cos (n \arccos x)$ $(-1 \le x \le 1)$, where $M$ is some positive
number. Obviously, on the interval $[-1, 1]$ its absolute value does not
exceed the number $M$. Its derivative is

$$T_n'(x) = \frac{n M \sin (n \arccos x)}{\sqrt{1 - x^2}},$$

which on the interval $[-1, 1]$ satisfies the inequality

$$|T_n'(x)| \le \frac{n M}{\sqrt{1 - x^2}}.$$

It turns out that this inequality is true for all polynomials $P_n(x)$ of degree $n$
which do not exceed the number $M$ in absolute value on the interval
$[-1, 1]$; i.e., for the derivative of any such polynomial on the interval
$[-1, 1]$ we have the inequality

$$|P_n'(x)| \le \frac{n M}{\sqrt{1 - x^2}}.$$

This inequality is to be credited to A. A. Markov, since it follows
directly from results of his which even go somewhat further. Markov
himself obtained it in connection with a question suggested to him by
D. I. Mendeleev.
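
The inequality can be tested numerically on a grid. The sketch below is a check under stated assumptions, not a proof: $M$ is estimated as the maximum of $|P_n|$ over grid points, the leading coefficient in `coeffs` is assumed nonzero, and the second polynomial is an arbitrary example of ours.

```python
import numpy as np
from numpy.polynomial import Polynomial

def markov_ratio(coeffs, samples=20001):
    """Largest value of |P'(x)| * sqrt(1 - x^2) / (n * M) over a grid on [-1, 1].

    coeffs lists the coefficients of P, lowest degree first."""
    p = Polynomial(coeffs)
    n = len(coeffs) - 1                        # degree, leading coefficient nonzero
    x = np.linspace(-1.0, 1.0, samples)
    M = np.max(np.abs(p(x)))                   # grid estimate of max |P| on [-1, 1]
    lhs = np.abs(p.deriv()(x)) * np.sqrt(1.0 - x ** 2)
    return float(np.max(lhs) / (n * M))

ratio_chebyshev = markov_ratio([0, 5, 0, -20, 0, 16])   # cos(5 arccos x)
ratio_other = markov_ratio([1, -3, 0, 2, 1])             # 1 - 3x + 2x^3 + x^4
```

For $\cos (5 \arccos x)$ the ratio comes out essentially equal to one, so the bound cannot be improved; for the second polynomial it stays well below one.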

In 1912, S. N. Bernšteĭn obtained a similar inequality, which bears
his name, for trigonometric polynomials and by using these inequalities
first showed how to establish the differentiability properties of a function
if one knows how fast it is approached by its sequence of best
approximations. Results of this kind concerning differentiable functions are given
in §§6 and 7.





§6. The Theorem of Weierstrass; the Best Approximation to a
Function as Related to Its Properties of Differentiability

The Weierstrass theorem. If we apply the general definition, given in
§4, of best approximation to a function to the case of approximating
polynomials, we are led to the following definition. The best uniform
approximation to the function $f(x)$ on the interval $[a, b]$ by polynomials
of degree $n$ is the (nonnegative) number $E_n(f)$ equal to the
minimum of the expression

$$\max_{a \le x \le b} |f(x) - P_n(x)| = \|f - P_n\|,$$

taken over all possible polynomials $P_n(x)$ of degree $n$.

Independently of whether or not we are able to find the exact
polynomial that best approximates the given function $f(x)$, it is of great
practical and theoretical interest to estimate the quantity $E_n(f)$ as closely
as possible. In fact, if we wish to approximate the function $f$ by a
polynomial with accuracy $\delta$, in other words, in such a way that

$$|f(x) - P_n(x)| \le \delta \tag{17}$$

for all $x$ in the given interval, then there is no sense in choosing it from
the polynomials of degree $n$ for which $E_n(f) > \delta$, since for this $n$ there
will certainly not be any polynomial $P_n$ for which (17) holds. On the
other hand, if it is known that $E_n(f) < \delta$, then it makes sense for such
$n$ to look for a polynomial $P_n(x)$ which will approximate $f(x)$ with
accuracy $\delta$, since such polynomials evidently exist.

The properties of the best approximating functions of various classes
have been the subject of deep and careful study. First of all we note the
following important fact.

If a function $f(x)$ is continuous on the interval $[a, b]$, then its best
approximation $E_n(f)$ tends to zero as $n$ increases to infinity.

This is the theorem proved by Weierstrass at the end of the last century.
It has great significance, since it guarantees the possibility of approximating
an arbitrary continuous function by a polynomial with any desired
accuracy. As a result, the set of all polynomials of any degree bears to
the set of all continuous functions defined on the interval exactly the
same relation as the collection $R$ of rational numbers bears to the collection
$H$ of all real (rational and irrational) numbers. In fact, for every irrational
number $\alpha$ and arbitrarily small positive number $\varepsilon$, one can always find
a rational number $r$ satisfying the inequality $|\alpha - r| < \varepsilon$. On the other
hand, if $f(x)$ is a function continuous on $[a, b]$ and $\varepsilon$ is an arbitrarily
small positive number, then by Weierstrass’s theorem there will exist an
algebraic polynomial $P_n(x)$ such that for all $x$ from the interval $[a, b]$
we have $|f(x) - P_n(x)| < \varepsilon$. Consequently, the best approximation
$E_n(f)$ to a continuous function tends to zero for $n \to \infty$.

Let us illustrate the theorem of Weierstrass in the following way.
Given the graph of an arbitrary continuous function (figure 9) defined
on the interval $[a, b]$, and an arbitrarily small positive number $\varepsilon$, let us
surround our graph with a strip of height $2\varepsilon$ in such a way that the graph
passes through the center of the strip. Then it is always possible to choose
an algebraic polynomial

$$P_n(x) = a_0 + a_1 x + \cdots + a_n x^n,$$

of sufficiently high degree such that its graph lies entirely inside the strip.

We make the following remark. As before, let $f(x)$ be an arbitrary
function continuous on $[a, b]$, and let $P_n(x)$ $(n = 1, 2, \ldots)$ be the
polynomials which are the best uniform approximation to it. It is easy to see
that the function $f(x)$ may be represented in the form of a series

$$f(x) = P_1(x) + [P_2(x) - P_1(x)] + [P_3(x) - P_2(x)] + \cdots,$$

which is uniformly convergent to $f(x)$ on $[a, b]$. This follows from the fact
that the sum of the first $n$ terms of the series is equal to $P_n(x)$, and

$$\max_{a \le x \le b} |f(x) - P_n(x)| = E_n(f),$$

while $E_n(f) \to 0$ as $n \to \infty$.

As a result we have a new formulation of Weierstrass’s theorem:

Every function continuous on the interval $[a, b]$ may be represented by
a series of algebraic polynomials converging uniformly to the function.

This result has great theoretical significance. It guarantees the possibility
of representing an arbitrary continuous function, however originally
given (for example, by means of a graph), in the form of an analytic
expression. (By an analytic expression we mean an elementary function
or else a function derived from a sequence of elementary functions by
means of a limit process.) Historically this result finally destroyed the
notion of analytic expression that had existed in mathematics almost up
to the middle of the last century. We say “finally,” since Weierstrass’s
theorem had been preceded by a series of general results of similar type,
relating chiefly to Fourier series. Until these results were obtained, it had
been assumed that analytic expressions were the means of representing
the especially desirable properties that were characteristic of analytic
functions. For example, it was usually taken for granted that analytic
expressions were infinitely differentiable and could even be expanded in
power series. But these ideas all proved to be without foundation. A
function may have no derivative anywhere in its interval of definition
and yet be representable by an analytic expression.

From a methodological point of view, the value of this discovery lies
in the fact that it enables us to realize with complete clarity that at least
in principle the methods of mathematics are applicable to an immeasurably
wider class of laws than had been realized before.

At the present time many different proofs of Weierstrass’s theorem are
known. For the most part they reduce to the construction of a sequence
of polynomials for a given continuous function $f$, which approximate $f$
uniformly as their degree increases. The simply constructed polynomial

$$B_n(x) = \sum_{k=0}^{n} C_n^k\, x^k (1 - x)^{n-k} f\!\left(\frac{k}{n}\right)$$

will approximate a continuous function $f(x)$ on the interval $[0, 1]$. It is
called the Bernstein polynomial. With increasing $n$ this polynomial
converges uniformly on the interval $[0, 1]$ to the function which generated
it.* Here $C_n^k$ is the number of combinations of $n$ elements taken $k$ at a time.
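
The Bernstein polynomial is simple to evaluate from its definition. The sketch below is a minimal illustration; the function name `bernstein` and the test function $f(x) = x^2$ are our own choices. For this $f$ one has the known identity $B_n(x) = x^2 + x(1 - x)/n$, which the code reproduces.

```python
import math

def bernstein(f, n, x):
    """Value at x in [0, 1] of the Bernstein polynomial of degree n generated by f."""
    return sum(math.comb(n, k) * x ** k * (1 - x) ** (n - k) * f(k / n)
               for k in range(n + 1))

square = lambda t: t * t
value_10 = bernstein(square, 10, 0.5)          # 0.25 + 0.25 / 10
error_10 = abs(value_10 - square(0.5))
error_100 = abs(bernstein(square, 100, 0.5) - square(0.5))
```

Even for this very smooth function the error at $x = 1/2$ is $1/(4n)$, so raising the degree from 10 to 100 gains only one decimal place.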

We note that a theorem similar to Weierstrass’s holds in the complex
domain. Exhaustive results in this direction are due to M. A. Lavrent’ev,
M. V. Keldyš, and S. N. Mergeljan.

The connection between the order of the best uniform approximation of
a function and its differentiability properties. We note further the
following results. If a function $f(x)$ on the interval $[a, b]$ has a derivative
$f^{(r)}(x)$ of order $r$ which does not exceed the number $K$ in absolute value,
then its best approximation $E_n(f)$ satisfies the inequality

$$E_n(f) \le \frac{c_r K}{n^r}, \tag{18}$$

where $c_r$ is a constant, depending only on $r$ (Jackson’s theorem). From
inequality (18) it can be seen that with increasing $n$ the quantity $E_n(f)$
converges to zero more rapidly for functions with derivatives of higher
order. In other words, the better (smoother) the function, the faster the
convergence to zero of its best approximation. Bernšteĭn proved that in
a certain sense the converse to this proposition is also true.

Still better in this respect than the differentiable functions are the
analytic functions. Bernšteĭn proved that for such functions, $E_n(f)$ satisfies
the inequality

$$E_n(f) \le c q^n, \tag{19}$$

where $c$ and $q$ are constants depending on the function $f$, and $0 < q < 1$;
i.e., $E_n(f)$ converges to zero more rapidly than a certain decreasing
progression. He also proved that conversely the inequality (19) implies
that the function $f$ is analytic on $[a, b]$.

* It must be remarked that, in spite of their simplicity, the Bernstein polynomials
are little used in practice. The explanation is that they converge very slowly, even for
functions with good differentiability properties.
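
The contrast between the two rates can be observed without computing best approximations themselves, since the deviation of any polynomial of degree $n$ is an upper bound for $E_n(f)$. The sketch below is an illustration under that assumption: it uses NumPy's `Chebyshev.interpolate` to interpolate at the Čebyšev zeros, and the two test functions $e^x$ and $|x|^3$ are our own choices.

```python
import numpy as np
from numpy.polynomial import Chebyshev

grid = np.linspace(-1.0, 1.0, 4001)

def interpolation_error(f, n):
    """Max error on a grid of the degree-n interpolant at the Čebyšev zeros.

    This is an upper bound for E_n(f), up to the grid resolution."""
    q = Chebyshev.interpolate(f, n)            # default domain is [-1, 1]
    return float(np.max(np.abs(f(grid) - q(grid))))

smooth = np.exp                                # analytic on [-1, 1]
rough = lambda x: np.abs(x) ** 3               # bounded third derivative only

errors = {n: (interpolation_error(smooth, n), interpolation_error(rough, n))
          for n in (4, 8, 16)}
```

For the analytic function the bound falls to rounding level by $n = 16$, as (19) suggests, while for $|x|^3$ it decreases only like a power of $n$, as in (18).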

We have given certain very important results that were discovered at
the beginning of this century and have been characteristic of the direction
taken by contemporary research in the theory of approximation of
functions. The practical value of these results may be seen from the
following example.

If $Q_n(x)$ is a polynomial of degree $n$, which interpolates the function
$f(x)$ on the interval $[-1, 1]$ at the $n + 1$ points of interpolation which
are the zeros of the Čebyšev polynomial $\cos ((n + 1) \arccos x)$, then on
this interval one has the inequality $|f(x) - Q_n(x)| \le c \ln n \, E_n(f)$, where
$c$ is a constant independent of $n$, and $E_n(f)$ is the best approximation
to the function $f$ on $[-1, 1]$. In this inequality we may replace $E_n(f)$
by the larger expressions, occurring in (18) or (19), provided $f$ is
sufficiently smooth, and obtain a good estimate of the approximation of our
interpolating polynomial. Since $\ln n$ increases very slowly with increasing
$n$, the order of the estimate in the given case differs little from the order
of convergence to zero of $E_n(f)$. The advantage of interpolation by the
Čebyšev points consists of the fact that for other points of interpolation
the factor $c \ln n$ in the corresponding inequality is replaced by a more
rapidly increasing factor; this is particularly true in the case of equally
spaced points of interpolation.


§7. Fourier Series

The origin of Fourier series. Fourier series arose in connection with
the study of certain physical phenomena, in particular, small oscillations
of elastic media. A characteristic example is the oscillation of a musical
string. Indeed, the investigation of oscillating strings was the origin
historically of Fourier series and determined the direction in which
their theory developed.

Let us consider (figure 11) a tautly stretched string, the ends of which
are fixed at the points $x = 0$ and $x = l$ of the axis $Ox$. If we displace
the string from its position of equilibrium, it will oscillate.

We will follow the motion of a specific point of the string, with abscissa
$x_0$. Its deviation vertically from the position of equilibrium is a function
$\varphi(t)$ of time. It can be shown that one can always give the string an initial
position and velocity at $t = 0$ such that as a result the point which we
have agreed to follow will perform harmonic oscillations in the vertical
direction, defined by the function

$$\varphi = \varphi(t) = A \cos \alpha k t + B \sin \alpha k t. \tag{20}$$

Here $\alpha$ is a constant depending only on the physical properties of the
string (on the density, tension, and length), $k$ is an arbitrary number,
and $A$ and $B$ are constants.

We note that our discussion relates only to small oscillations of the
string. This gives us the right to assume approximately that every point
$x_0$ is oscillating only in the vertical direction, displacements in the
horizontal direction being ignored.* We also assume that the friction
arising from the oscillation of the string is so small that we may ignore it.
As a result of these approximate assumptions, the oscillations will not
die out.

* This question is directly connected with the differential equation of the oscillating
string, which was discussed in Chapter VI.

The possibilities of oscillation for the point $x_0$ are of course, not
exhausted by the periodic motions defined by the harmonic functions (20),
but these functions do have the following remarkable property.
Experiments and their accompanying theory show that every possible oscillation
of the point $x_0$ is the result of combining certain harmonic oscillations
of the form (20). Relatively simple oscillations are obtained by combining
a finite number of such oscillations; i.e., they are described by functions
of the form

$$\varphi(t) = A_0 + \sum_{k=1}^{n} (A_k \cos \alpha k t + B_k \sin \alpha k t),$$

where $A_k$ and $B_k$ are corresponding constants. These functions are called
trigonometric polynomials. In more complicated cases, the oscillation will
be the result of combining an infinite number of oscillations of the form
(20), corresponding to $k = 1, 2, 3, \ldots$ and with suitably chosen constants
$A_k$ and $B_k$, depending on the number $k$. Consequently, we arrive at the
necessity of representing a given function $\varphi(t)$ of period $2\pi/\alpha$, which
describes an arbitrary oscillation of the point $x_0$, in the form of a series

$$\varphi(t) = A_0 + \sum_{k=1}^{\infty} (A_k \cos \alpha k t + B_k \sin \alpha k t). \tag{21}$$

There are many other situations in physics where it is natural to
consider a given function, even though it does not necessarily describe
an oscillation, as the sum of an infinite trigonometric series of the form
(21). Such a case arises, for example, in connection with the vibrating
string itself. The exact law for the subsequent oscillation of a string,
to which at the beginning of the experiment we have given a specific
initial displacement (for example, as illustrated in figure 12) is easy to
calculate, provided we know the expansion in a trigonometric series

$$f(x) = \sum_{k=1}^{\infty} \alpha_k \sin \frac{k\pi}{l} x$$

(a particular case of the series (21)), of the function $f(x)$ describing the
initial position.

Fig. 12.

Expansion of functions in a trigonometric series. On the basis of
what has been said there arises the
fundamental question: Which functions of period $2\pi/\alpha$ can be represented
as the sum of a trigonometric series of the form (21)? This question was
raised in the 18th century by Euler and Bernoulli in connection with
Bernoulli’s study of the vibrating string. Here Bernoulli took the point
of view suggested by physical considerations that a very wide class of
continuous functions, including in particular all graphs drawn by hand,
can be expanded in a trigonometric series. This opinion received harsh
treatment from many of Bernoulli’s contemporaries. They held tenaciously
to the idea prevalent at the time that if a function is represented as an
analytic expression (such as a trigonometric series) then it must have
good differentiability properties. But the function illustrated in figure 12
does not even have a derivative at the point $\xi$; in such a case, how can it
be defined by one and the same analytic expression on the whole interval
$[0, l]$?

We know now that the physical point of view of Bernoulli was quite
right. But to put an end to the controversy it was necessary to wait an
entire century, since a full answer to these questions required first of all
that the concepts of a limit and of the sum of a series be put on an exact
basis.

The fundamental mathematical investigations confirming the physical
point of view but based on the older ideas concerning the foundations
of analysis were completed in 1807-1822 by the French mathematician
Fourier.

Finally, in 1829, the German mathematician Dirichlet showed, with
all the rigor with which it would be done in present-day mathematics,
that every continuous function of period $2\pi/\alpha$,* which for any one period
has a finite number of maxima and minima, can be expanded in a unique
trigonometric Fourier series, uniformly convergent to the function.†

Figure 13 illustrates a function satisfying Dirichlet’s conditions. Its graph
is continuous and periodic, with period $2\pi$, and has one
maximum and one minimum in the period $0 \le x \le 2\pi$.

Fourier coefficients. In what follows we will consider functions of
period $2\pi$, which will simplify the formulas. We consider any continuous
function $f(x)$ of period $2\pi$ satisfying Dirichlet’s condition. By Dirichlet’s
theorem it may be expanded into a trigonometric series

$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx), \tag{22}$$

which is uniformly convergent to it. The fact that the first term is written
as $a_0/2$ rather than $a_0$ has no real significance but is purely a matter of
convenience, as we shall see later.

We pose the problem: to compute the coefficients $a_k$ and $b_k$ of the
series for a given function $f(x)$.

* The function $f(x)$ has period $\omega$ if it satisfies the equation $f(x + \omega) = f(x)$.
† In fact, Dirichlet's theorem also applies to a certain class of discontinuous functions,
the so-called functions of bounded variation. For discontinuous functions, of course,
the corresponding series is nonuniformly convergent.

To this end we note the following equations:

$$\int_{-\pi}^{\pi} \cos kx \cos lx \, dx = 0 \qquad (k \ne l;\ k, l = 0, 1, \ldots),$$

$$\int_{-\pi}^{\pi} \sin kx \sin lx \, dx = 0 \qquad (k \ne l;\ k, l = 0, 1, \ldots),$$

$$\int_{-\pi}^{\pi} \sin kx \cos lx \, dx = 0 \qquad (k, l = 0, 1, 2, \ldots), \tag{23}$$

$$\int_{-\pi}^{\pi} \cos^2 kx \, dx = \pi \qquad (k = 1, 2, \ldots),$$

$$\int_{-\pi}^{\pi} \sin^2 kx \, dx = \pi \qquad (k = 1, 2, \ldots),$$

which the reader may verify. These integrals are easy to compute by
reducing the products of the various trigonometric functions to their
sums and differences and their squares to expressions containing the
corresponding trigonometric functions of double the angle. The first
three equations state that the integral, over a period of the function, of the
product of two different functions from the sequence $1, \cos x, \sin x,
\cos 2x, \sin 2x, \ldots$ is equal to zero (the so-called orthogonality property
of the trigonometric functions). On the other hand, the integral of the
square of each of the functions of this sequence is equal to $\pi$. The first
function, identically equal to one, forms an exception, since the integral
of its square over the period is equal to $2\pi$. It is this fact which makes
it convenient to write the first term of the series (22) in the form $a_0/2$.
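
The relations (23) can be verified numerically. The sketch below assumes NumPy and replaces each integral over the period by the mean of equally spaced samples times $2\pi$, which is accurate to rounding for products of low-order trigonometric functions; the names are ours.

```python
import numpy as np

def periodic_integral(values):
    """Integral over one period of length 2*pi from equally spaced samples."""
    return 2.0 * np.pi * float(np.mean(values))

x = np.linspace(-np.pi, np.pi, 256, endpoint=False)

# The sequence 1, cos x, sin x, cos 2x, sin 2x, cos 3x, sin 3x.
functions = [np.ones_like(x)]
for k in range(1, 4):
    functions += [np.cos(k * x), np.sin(k * x)]

# Matrix of integrals of all pairwise products.
gram = np.array([[periodic_integral(u * v) for v in functions] for u in functions])
expected = np.diag([2.0 * np.pi] + [np.pi] * 6)
```

The matrix `gram` is diagonal, with $2\pi$ in the position of the constant function and $\pi$ elsewhere, which is exactly the content of (23) and of the remark about the first term.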

Now we can easily solve our problem. To compute the coefficient $a_m$,
we multiply the left side and each term on the right side of the series (22)
by $\cos mx$ and integrate term by term over a period $2\pi$, as is permissible
since the series obtained after multiplication by $\cos mx$ is uniformly
convergent. By (23) all integrals on the right side, with the exception
of the integral corresponding to $\cos mx$, will be zero, so that obviously

$$\int_{-\pi}^{\pi} f(x) \cos mx \, dx = a_m \pi,$$

hence

$$a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx \qquad (m = 0, 1, 2, \ldots). \tag{24}$$


Similarly, multiplying the left and right sides of (22) by $\sin mx$ and
integrating over the period, we get an expression for the coefficients

$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin mx \, dx \qquad (m = 1, 2, \ldots), \tag{25}$$

and we have solved our problem. The numbers $a_m$ and $b_m$ computed by
formulas (24) and (25) are called the Fourier coefficients of the function
$f(x)$.
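
The coefficient formulas translate directly into a numerical procedure. The sketch below is an approximation under stated assumptions: the integrals over $[-\pi, \pi]$ are replaced by sums over equally spaced samples, and the function name `fourier_coefficients` and the test function are ours.

```python
import numpy as np

def fourier_coefficients(f, m_max, samples=4096):
    """Approximate a_m and b_m for m = 0..m_max of a function of period 2*pi.

    (1/pi) * integral over [-pi, pi] is approximated by 2 * mean of the samples."""
    x = np.linspace(-np.pi, np.pi, samples, endpoint=False)
    fx = f(x)
    a = np.array([2.0 * np.mean(fx * np.cos(m * x)) for m in range(m_max + 1)])
    b = np.array([2.0 * np.mean(fx * np.sin(m * x)) for m in range(m_max + 1)])
    return a, b

# A trigonometric polynomial with known coefficients: a_0/2 = 3, a_2 = 2, b_3 = -1.
g = lambda x: 3.0 + 2.0 * np.cos(2 * x) - np.sin(3 * x)
a, b = fourier_coefficients(g, 4)
```

Note that `a[0]` comes out as 6, not 3, because the constant term of the series is written as $a_0/2$.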

Let us take as an example the function $f(x)$ of period $2\pi$ illustrated in
figure 13. Obviously this function is continuous and satisfies Dirichlet’s
condition, so that its Fourier series converges uniformly to it.

It is easy to see that this function also satisfies the condition
$f(-x) = -f(x)$. The same condition also clearly holds for the function
$F_1(x) = f(x) \cos mx$, which means that the graph of $F_1(x)$ is symmetric with
respect to the origin. From geometric arguments it is clear that
$\int_{-\pi}^{\pi} F_1(x)\,dx = 0$, so that $a_m = 0$ $(m = 0, 1, 2, \ldots)$. Further, it is not
difficult to see that the function $F_2(x) = f(x) \sin mx$ has a graph which
is symmetric with respect to the axis $Oy$ so that

$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} F_2(x)\,dx = \frac{2}{\pi} \int_{0}^{\pi} F_2(x)\,dx.$$

But for even $m$ this graph is symmetric with respect to the center $\pi/2$ of
the segment $[0, \pi]$, so that $b_m = 0$ for even $m$. For odd $m = 2l + 1$
$(l = 0, 1, 2, \ldots)$ the graph of $F_2(x)$ is symmetric with respect to the straight
line $x = \pi/2$, so that

$$b_{2l+1} = \frac{4}{\pi} \int_{0}^{\pi/2} F_2(x)\,dx.$$

But, as can be seen from the sketch, on the segment $[0, \pi/2]$ we have
simply $f(x) = x$, so that by integration by parts, we get

$$b_{2l+1} = \frac{4}{\pi} \int_{0}^{\pi/2} x \sin (2l + 1)x \, dx = \frac{4 (-1)^l}{\pi (2l + 1)^2},$$

and consequently

$$f(x) = \frac{4}{\pi} \sum_{l=0}^{\infty} \frac{(-1)^l \sin (2l + 1)x}{(2l + 1)^2}.$$

Thus we have found the expansion of our function in a Fourier series.
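
The expansion can be compared with the function itself. The sketch below assumes that the function of figure 13 is the odd function of period $2\pi$ equal to $x$ on $[0, \pi/2]$ and to $\pi - x$ on $[\pi/2, \pi]$, which we represent by $\arcsin(\sin x)$; the helper names are ours.

```python
import numpy as np

def triangle(x):
    """Odd function of period 2*pi equal to x on [0, pi/2] and pi - x on [pi/2, pi]."""
    return np.arcsin(np.sin(x))

def partial_sum(x, terms):
    """Sum of the first `terms` nonzero terms of the series for the triangle."""
    l = np.arange(terms)
    m = 2 * l + 1
    series = ((-1.0) ** l) * np.sin(np.outer(x, m)) / m ** 2
    return (4.0 / np.pi) * np.sum(series, axis=1)

x = np.linspace(-np.pi, np.pi, 1001)
max_error = float(np.max(np.abs(triangle(x) - partial_sum(x, 50))))
```

With 50 terms the largest error, which occurs at the corners $x = \pm\pi/2$, is below $0.01$, consistent with coefficients that decrease like $1/(2l + 1)^2$.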

Convergence of the Fourier partial sums to the generating function.

In applications it is customary to take as an approximation to the function
$f(x)$ of period $2\pi$ the sum

$$S_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx)$$

of the first $n$ terms of its Fourier series, and then there arises the question
of the error of the approximation. If the function $f(x)$ of period $2\pi$ has
a derivative $f^{(r)}(x)$ of order $r$ which for all $x$ satisfies the inequality
$|f^{(r)}(x)| \le K$, then the error of the approximation may be estimated as
follows:

$$|f(x) - S_n(x)| \le \frac{c_r K \ln n}{n^r},$$

where $c_r$ is a constant depending only on $r$. We see that the error converges
to zero with increasing $n$, the convergence being the more rapid the more
derivatives the function has.

For a function which is analytic on the whole real axis there is an even
better estimate, as follows:

$$|f(x) - S_n(x)| < c q^n, \tag{26}$$

where c and q are positive constants depending on f, and $q < 1$. It is
remarkable that the converse is also true, namely that if the inequality
(26) holds for a given function, then the function is necessarily analytic.
This fact, which was discovered at the beginning of the present century,
in a certain sense reconciles the controversy between D. Bernoulli and
his contemporaries. We can now state: If a function is expandable in a
Fourier series which converges to it, this fact in itself is far from implying
that the function is analytic; however, it will be analytic if its deviation
from the sum of the first n terms of the Fourier series decreases more
rapidly than the terms of some decreasing geometric progression.

A comparison of the estimates of the approximations provided by the
Fourier sums with the corresponding estimates for the best approximations
of the same functions by trigonometric polynomials shows that for
smooth functions the Fourier sums give very good approximations,
which are, in fact, close to the best approximations. But for nonsmooth
continuous functions the situation is worse: Among these, for example,
occur some functions whose Fourier series diverges on the set of all
rational points.

It remains to note that in the theory of Fourier series there is a question
which was raised long ago and has not yet been answered: Does there
exist a continuous periodic function $f(x)$ whose Fourier series fails for
all x to converge to the function as $n \to \infty$? The best result in this
direction is due to A. N. Kolmogorov, who proved in 1926 that there exists a
periodic Lebesgue-integrable function whose Fourier series does not
converge to it at any point. But a Lebesgue-integrable function may be
discontinuous, as is the case with the function constructed by Kolmogorov.
The problem still awaits its final solution.
To provide approximations by trigonometric polynomials to arbitrary
continuous periodic functions, the methods of the so-called summation
of Fourier series are in use at the present time. In place of the Fourier
sums as an approximation to a given function we consider certain
modifications of them. A very simple method of this sort was proposed
by the Hungarian mathematician Fejér. For a continuous periodic
function we first, in a purely formal way, construct its Fourier series,
which may be divergent, and then form the arithmetic mean of the
first n + 1 partial sums

$$\sigma_n(x) = \frac{S_0(x) + S_1(x) + \dots + S_n(x)}{n + 1}. \tag{27}$$

This is called the Fejér sum of order n corresponding to the given function
$f(x)$. Fejér proved that as $n \to \infty$ this sum converges uniformly to $f(x)$.
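As an illustration of formula (27), the following sketch forms the Fejér sums for the odd triangular wave whose sine coefficients 4(−1)^l / (π(2l+1)^2) were found earlier in this section, and measures the largest deviation on a grid. The shape of the function, the grid and all names (`fejer_sum`, `max_error`) are assumptions made for the example; it is not meant as an efficient implementation.

```python
import math

def f(x):
    """Odd triangular wave of period 2*pi with f(x) = x on [0, pi/2] (assumed test function)."""
    x = (x + math.pi) % (2 * math.pi) - math.pi
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    return sign * (x if x <= math.pi / 2 else math.pi - x)

def fourier_term(k, x):
    """k-th term of the Fourier series of f; only odd sine terms are nonzero."""
    if k % 2 == 0:
        return 0.0
    l = (k - 1) // 2
    return 4 * (-1) ** l * math.sin(k * x) / (math.pi * k * k)

def fejer_sum(x, n):
    """Arithmetic mean of the partial sums S_0(x), ..., S_n(x), as in (27)."""
    partial = 0.0
    total = 0.0
    for k in range(n + 1):
        partial += fourier_term(k, x)   # partial now equals S_k(x)
        total += partial
    return total / (n + 1)

def max_error(n, points=400):
    """Largest deviation |f - sigma_n| over a uniform grid on [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * i / points for i in range(points + 1)]
    return max(abs(f(x) - fejer_sum(x, n)) for x in xs)

if __name__ == "__main__":
    for n in (8, 32, 128):
        print(n, max_error(n))
```

For this particular function the printed deviations should shrink as n grows, in line with Fejér's theorem, although more slowly than those of the Fourier sums themselves.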

§8. Approximation in the Sense of the Mean Square

Let us return to the problem of the oscillating string. We assume that
at a certain moment $t_0$ the string has the form $y = f(x)$. We can prove
that its potential energy W, i.e., the work made available as it moves
from the given position to its position of equilibrium, is equal (for small
deviations of the string) to the integral $W = \int_0^l f'^2(x)\,dx$, at least
up to a constant factor. Suppose now that we wish to approximate the function
$f(x)$ by another function $\varphi(x)$. Together with the given string, we will
consider a string whose shape is defined by $\varphi(x)$, and still a third
string, defined by the function $f(x) - \varphi(x)$. It may be proved that if
the energy

$$\int_0^l [f'(x) - \varphi'(x)]^2\,dx \tag{28}$$

of the third string is small, then the difference between the energies of the
first two strings will also be small.* Thus, if it is important that the
second string have an energy which differs little from the first, we must
try to find a function $\varphi'(x)$ for which the integral (28) will be as
small as possible. We are thus led to the problem of approximation to a
function (in this case $f'(x)$) in the sense of the mean square.

* In fact, if

Here is how this problem is to be stated in the general case. On the
interval [a, b] we are given the function $F(x)$, and also the function

$$\Phi(x; \alpha_0, \alpha_1, \dots, \alpha_n), \tag{29}$$

depending not only on x but also on the parameters
$\alpha_0, \alpha_1, \dots, \alpha_n$. It is required to choose these
parameters in such a way as to minimize the integral

$$\int_a^b [F(x) - \Phi(x; \alpha_0, \alpha_1, \dots, \alpha_n)]^2\,dx. \tag{30}$$

This problem is very similar in idea to Čebyšev's problem. Here also
the idea is to find the best approximation of the function $F(x)$ by functions
of the family (29), but only in the sense of the mean square. It is now
unimportant for us whether or not the difference $F - \Phi$ is small for all
values of x on the interval [a, b]; on a small part of the interval the
difference $F - \Phi$ may even be large provided only that the integral (30)
is small, as is the case, for example, for the two graphs illustrated in
figure 14. The smallness of the quantity (30) shows that the functions F and
$\Phi$ are close to each other on by far the greater part of the interval.*
As to the choice in practice of one method of approximation or another,
everything depends on the purpose in view. In the earlier example of the
string, it is natural to approximate the function $f'(x)$ in the sense of the
mean square. On the other hand, the method of mean squares was unsatisfactory
for Čebyšev in solving his problems in the construction of mechanisms, since a
machine component projecting beyond the limits of tolerance, even if only over
a very small part of the machine, would be quite intolerable: One such
projection would spoil the whole machine. Thus Čebyšev had to develop a new
mathematical method corresponding to the problem which confronted him.

* In Chapter XIX we will see that there is a profound analogy between the
closeness of the functions in the sense of the mean square and the distance
between points in ordinary space.
We should state that from the computational point of view the method
of the mean square is more convenient, since it can be reduced to the
application of well-developed methods of general analysis.

As an example let us consider the following characteristic problem. 

We wish to make the best approximation in the sense of the mean
square to a given continuous function $f(x)$ on the interval [a, b] by sums
of the form

$$\sum_{k=1}^{n} \alpha_k \varphi_k(x),$$

where the $\alpha_k$ are constants and the functions $\varphi_k(x)$ are
continuous and form an orthogonal and normal system.

This last means that we have the following equations:

$$\int_a^b \varphi_k \varphi_l \, dx = 0, \quad k \ne l \qquad (k, l = 1, 2, \dots, n),$$
$$\int_a^b \varphi_k^2 \, dx = 1 \qquad (k = 1, 2, \dots, n).$$

Let us introduce the numbers

$$a_k = \int_a^b f(x) \varphi_k(x)\,dx \qquad (k = 1, \dots, n).$$

These numbers $a_k$ are called the Fourier coefficients of f with respect to
the $\varphi_k$.

For arbitrary coefficients $\alpha_k$, on the basis of the properties of
orthogonality and normality of the $\varphi_k$, we have the equation

$$\int_a^b \Big(f - \sum_{k=1}^{n} \alpha_k \varphi_k\Big)^2 dx = \int_a^b f^2\,dx + \sum_{k=1}^{n} \alpha_k^2 - 2 \sum_{k=1}^{n} \alpha_k a_k$$
$$= \Big(\int_a^b f^2\,dx - \sum_{k=1}^{n} a_k^2\Big) + \sum_{k=1}^{n} (\alpha_k - a_k)^2.$$

The first term on the right side of the derived equation does not depend
on the numbers $\alpha_k$. Thus the right side will be smallest for those
$\alpha_k$ which make the second term smallest, and obviously this can happen
only if the numbers $\alpha_k$ are equal to the corresponding Fourier
coefficients $a_k$.

Thus we have reached the following important result. If the functions
$\varphi_k$ form an orthogonal and normal system on the interval [a, b], then
the sum $\sum_{k=1}^{n} \alpha_k \varphi_k(x)$ will be the best approximation,
in the sense of the mean square, to the function $f(x)$ on this interval if and
only if the numbers $\alpha_k$ are the Fourier coefficients of the function f
with respect to $\varphi_k(x)$.
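The minimal property can be observed numerically. The sketch below is only an illustration under stated assumptions: the orthonormal system is taken to be the first five normalized trigonometric functions on [0, 2π], the function f(x) = x(2π − x) is a hypothetical test function, and all integrals are replaced by a midpoint rule. It checks that changing the Fourier coefficients by increments d_k raises the mean-square error by the sum of the d_k squared, as the identity above predicts.

```python
import math

A, B, STEPS = 0.0, 2 * math.pi, 4000

def integrate(g):
    """Midpoint rule on [A, B]; an approximation of the integrals in the text."""
    h = (B - A) / STEPS
    return h * sum(g(A + (i + 0.5) * h) for i in range(STEPS))

# Assumed orthonormal system: normalized 1, cos x, sin x, cos 2x, sin 2x.
PHI = [
    lambda x: 1 / math.sqrt(2 * math.pi),
    lambda x: math.cos(x) / math.sqrt(math.pi),
    lambda x: math.sin(x) / math.sqrt(math.pi),
    lambda x: math.cos(2 * x) / math.sqrt(math.pi),
    lambda x: math.sin(2 * x) / math.sqrt(math.pi),
]

def f(x):
    """Hypothetical test function, chosen only for the illustration."""
    return x * (2 * math.pi - x)

def fourier_coefficients():
    """a_k = integral of f * phi_k over [A, B]."""
    return [integrate(lambda x, p=p: f(x) * p(x)) for p in PHI]

def mean_square_error(alpha):
    """Integral of (f - sum alpha_k phi_k)^2 over [A, B]."""
    return integrate(
        lambda x: (f(x) - sum(c * p(x) for c, p in zip(alpha, PHI))) ** 2
    )

if __name__ == "__main__":
    a = fourier_coefficients()
    shift = [0.3, -0.2, 0.1, 0.0, 0.25]
    perturbed = [c + d for c, d in zip(a, shift)]
    print(mean_square_error(a), mean_square_error(perturbed))
```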

On the basis of equation (23) it is easily established that the functions

$$\frac{1}{\sqrt{2\pi}}, \quad \frac{\cos x}{\sqrt{\pi}}, \quad \frac{\sin x}{\sqrt{\pi}}, \quad \frac{\cos 2x}{\sqrt{\pi}}, \quad \dots$$

form an orthogonal and normal system on the interval $[0, 2\pi]$. Thus the
stated proposition, as applied to the trigonometric functions, will have
the following form.

The Fourier sum $S_n(x)$, computed for a given continuous function $f(x)$
of period $2\pi$, is the best approximation, in the sense of the mean square,
to the function $f(x)$ on the interval $[0, 2\pi]$, among all trigonometric
polynomials

$$t_n(x) = \alpha_0 + \sum_{k=1}^{n} (\alpha_k \cos kx + \beta_k \sin kx)$$

of order n.

From this result and from Fejér's theorem, formulated in §7, we are
led to another remarkable fact.

Let $f(x)$ be a continuous function of period $2\pi$ and $\sigma_n(x)$ be its
Fejér sum of order n, defined in §7 by equation (27).

We introduce the notation

$$\max |f(x) - \sigma_n(x)| = \eta_n.$$

Since the Fourier sums $S_k(x)$ $(k = 0, 1, \dots, n)$ are trigonometric
polynomials of order $k \le n$, it is obvious that $\sigma_n(x)$ is a
trigonometric polynomial of order n. Thus from the minimal property of the sum
$S_n(x)$ shown previously, we have the inequality

$$\int_0^{2\pi} [f(x) - S_n(x)]^2\,dx \le \int_0^{2\pi} [f(x) - \sigma_n(x)]^2\,dx \le \int_0^{2\pi} \eta_n^2\,dx = 2\pi \eta_n^2.$$

Since, by Fejér's theorem, the quantity $\eta_n$ converges to zero for
$n \to \infty$, we obtain the following important result.

For any continuous function of period $2\pi$ we have the equation

$$\lim_{n \to \infty} \int_0^{2\pi} [f(x) - S_n(x)]^2\,dx = 0.$$

In this case we say that the Fourier sum of order n of a continuous function
$f(x)$ converges to $f(x)$ in the sense of the mean square, as n increases
beyond all bounds.
In fact, this statement is true for a wider class of functions, namely
those which are integrable, together with their square, in the sense of
Lebesgue.

We will stop here and will not present other interesting facts from the
theory of Fourier series and orthogonal functions, based on approximation
in the sense of the mean square. Important physical applications of
orthogonal systems of functions have already been introduced in Chapter
VI. Finally, we note that these questions are also discussed from a somewhat
different point of view in Chapter XIX.
APPROXIMATION METHODS
AND COMPUTING TECHNIQUES


§1. Approximation and Numerical Methods

Characteristic peculiarities of approximation methods. In many cases
the application of mathematics to the study of events in the outside world
is based on the fact that the laws governing these events have a quantitative
character and can be described by certain formulas, equations, or
inequalities. This allows us to investigate the events numerically and to
make the calculations which are so necessary in practical life.

As soon as a quantitative law has been found, purely mathematical
methods may be used to investigate it. For definiteness, let us take some
law which is described by an equation. This may be the law of motion
of a body in Newtonian mechanics, the law of heat conduction or the
propagation of electromagnetic oscillations, and so forth. Such equations
are discussed in detail in Chapters V and VI. Usually the equation has
adjoined to it certain conditions which its solution must satisfy (in
Chapters V and VI these are the boundary and initial conditions) and
which define a unique solution.

The first and most important mathematical tasks here will be the 
following: 

1. To establish the existence of a solution. Even if it seems obvious
from the physical point of view that the problem has a solution, a
mathematical proof of the solvability of a rigorously formulated problem
is usually considered as the necessary evidence that the mathematical
formulation of the problem is a satisfactory one. In a wide class of
problems it is possible to establish mathematically the existence of a
solution.

2. To attempt to find an explicit expression or formula for the quantity
which characterizes the event under consideration. Usually such an
expression can be found only in the simplest cases. It often happens
that the explicit expression obtained is so complicated that to make use
of it for the desired numerical results is very difficult or even impossible.

3. To find a procedure for constructing an approximation formula, 
which gives a solution with any desired degree of accuracy. This can be 
done in many cases. 

4. But very often it will be possible to find one or more methods for 
direct numerical calculation of the solution. 

The development of such numerical methods (many of which are
approximate) of solving problems of science and technology has produced
a particular branch of mathematics that at the present time is usually
called mathematics of computation.

The methods of computational mathematics are naturally approximative,
since every quantity is computed only to a certain number of
significant figures; for example, to five, six, etc., decimal places.

For applications this is sufficient, since knowing the exact value of
any quantity is often unnecessary. In technical questions, for example,
the desired quantity usually serves to define the dimensions or other
parameters of a manufactured article. Every manufacturing process is
only approximate, so that technical computations with an exactness
which goes beyond the allowed "tolerances" are obviously valueless.

So for computational purposes there is no need of exact formulas or
of exact solutions of equations. Exact formulas and equations may be
replaced by others that are admittedly inexact, provided they are close
enough to the original ones that the error produced by such a change
does not exceed given bounds.

Later we shall return to this question of replacing one problem by
another. At the moment, however, we merely wish to emphasize the first
characteristic feature of computational methods, namely that by their
very nature they can, as a rule, produce only approximate results; but
then only such results are needed in practice.

We now turn our attention to a second aspect of computational
methods in mathematics. In any computation we can operate with only
a finite number of digits and obtain all the results after a finite number
of arithmetic operations. If we perform the computations according to
some formula, then the latter must first have been transformed in such
a way that it involves only a finite number of terms with a finite number
of parameters. It is known, for example, that many functions may be
represented as the sum of a power series

$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots. \tag{1}$$
Thus, the function sin x, where x is the radian measure of an angle, may
be expanded in the power series

$$\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots.$$

To find the exact value of $f(x)$, we would need to sum up "all" the
terms of the series (1), but generally speaking, this is impossible. To find
$f(x)$ approximately, it is sufficient to take only a certain finite number
of terms of the series. For example, it may be proved that to compute
sin x with an accuracy of $10^{-4}$ for an angle from zero to half a right
angle it is sufficient to take the terms through $x^5$, so that sin x is
replaced by the polynomial

$$\frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!}.$$
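The size of the error committed by this truncation is easy to measure. The short sketch below (the grid and the names `sin_poly` and `max_error` are illustrative choices, not from the text) compares the polynomial with the library sine on the interval from zero to π/4 and with the first omitted term x^7/7!, which bounds the error of an alternating series with decreasing terms.

```python
import math

def sin_poly(x):
    """Terms of the sine series through x^5."""
    return x - x ** 3 / 6 + x ** 5 / 120

def max_error(a=0.0, b=math.pi / 4, points=1001):
    """Largest deviation from math.sin on a uniform grid over [a, b]."""
    xs = [a + (b - a) * i / (points - 1) for i in range(points)]
    return max(abs(math.sin(x) - sin_poly(x)) for x in xs)

if __name__ == "__main__":
    print("measured maximum error:", max_error())
    print("first omitted term at pi/4:", (math.pi / 4) ** 7 / math.factorial(7))
```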

For the numerical solution of a problem of mathematical analysis that
consists of determining some function, we must by one means or another
replace this problem by the problem of finding certain numerical
parameters, the knowledge of which enables us to make an approximate
computation of the unknown function. We will illustrate this by an
example.

Let it be required to solve, on the interval $a \le x \le b$, the
boundary-value problem for the differential equation

$$L(y) - f(x) = y'' + p(x) y' + q(x) y - f(x) = 0 \tag{2}$$

with boundary conditions $y(a) = 0$, $y(b) = 0$. In one of the possible
methods of solution, namely Galerkin's method, we start with a system
of linearly independent functions $\omega_1(x), \omega_2(x), \dots$, which
satisfy the boundary conditions (Chapter VI, §5). This system is so chosen as
to be "complete" in the sense that a function which is integrable on [a, b]
and is orthogonal to all the $\omega_k$ $(k = 1, 2, \dots)$ will be equal to
zero at all (more exactly, at "almost all") points of the interval. The
condition that $y(x)$ satisfies the differential equation (2) may be described
in the form of an orthogonality requirement

$$\int_a^b [L(y) - f] \, \omega_k \, dx = 0 \qquad (k = 1, 2, \dots). \tag{3}$$

Let us assume that the solution of the problem may be expanded in
a series in the $\omega_k$

$$y(x) = a_1 \omega_1(x) + a_2 \omega_2(x) + \cdots. \tag{4}$$
We now seek to determine the conditions that must be satisfied by the
coefficients $a_k$. For arbitrary $a_k$ the sum of the series (4) will satisfy
the boundary conditions. It remains to choose the $a_k$ in such a way that
equations (3) are satisfied. The coefficients $a_k$ form an infinite set, and
to compute all of them is generally speaking impossible. For simplification
we retain only a finite number of terms on the right side of (4) and so
obtain the expression

$$y(x) \approx a_1 \omega_1(x) + \cdots + a_n \omega_n(x). \tag{5}$$

We cannot hope to satisfy equation (3) for all $\omega_k$ $(k = 1, 2, \dots)$
since we have only n arbitrary parameters $a_k$ $(k = 1, 2, \dots, n)$. Thus we
are forced to give up an exact solution of the differential equation (2). But
it is natural to expect that the sum (5) will satisfy this differential
equation with a small error if n is taken sufficiently large and condition (3)
is satisfied for the first n of the functions $\omega_k$. This leads to the
equations of Galerkin's method

$$\int_a^b \Big[ L\Big( \sum_{k=1}^{n} a_k \omega_k \Big) - f \Big] \omega_i \, dx = 0 \qquad (i = 1, 2, \dots, n).$$

After finding the $a_k$ from these equations, we construct an approximate
expression for the function (5).
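The equations of Galerkin's method can be set up and solved in a few lines for a simple model problem. The sketch below is only an illustration under stated assumptions: the equation is y'' + y + x = 0 on [0, 1] (so p = 0, q = 1, f = −x, with exact solution sin x / sin 1 − x), the functions ω_k(x) = x^k (1 − x) are one possible choice satisfying the boundary conditions, the integrals are computed by Simpson's rule, and the linear system is solved by elementary Gaussian elimination. None of these choices comes from the text.

```python
import math

A, B = 0.0, 1.0                       # interval [a, b] of the model problem

def p(x): return 0.0
def q(x): return 1.0
def f(x): return -x                   # L(y) - f = y'' + y + x

def omega(k, x):                      # assumed basis: x^k (1 - x), zero at both ends
    return x ** k * (1 - x)

def d_omega(k, x):
    return k * x ** (k - 1) - (k + 1) * x ** k

def d2_omega(k, x):
    head = k * (k - 1) * x ** (k - 2) if k >= 2 else 0.0
    return head - (k + 1) * k * x ** (k - 1)

def L_omega(k, x):
    """The operator L applied to omega_k."""
    return d2_omega(k, x) + p(x) * d_omega(k, x) + q(x) * omega(k, x)

def integrate(g, steps=200):
    """Composite Simpson rule on [A, B]; steps must be even."""
    h = (B - A) / steps
    total = g(A) + g(B)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(A + i * h)
    return total * h / 3

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def galerkin(n):
    """Coefficients a_1..a_n from the n Galerkin equations."""
    idx = range(1, n + 1)
    matrix = [[integrate(lambda x, i=i, k=k: L_omega(k, x) * omega(i, x))
               for k in idx] for i in idx]
    rhs = [integrate(lambda x, i=i: f(x) * omega(i, x)) for i in idx]
    return solve(matrix, rhs)

def y_approx(coeffs, x):
    return sum(a * omega(k, x) for k, a in enumerate(coeffs, start=1))

if __name__ == "__main__":
    coeffs = galerkin(3)
    exact = math.sin(0.5) / math.sin(1.0) - 0.5
    print(coeffs, y_approx(coeffs, 0.5), exact)
```

Even with two or three functions the approximate value at the midpoint should agree with the exact solution to several decimal places for this smooth problem.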

A similar simplified formula holds for the solution of variational
problems by the Ritz method, in approximate harmonic analysis of
functions and in many other questions.

We give another example of simplification of an equation. Let it be
required to find a function y of one or several arguments by solving
some functional equation, for example, a differential or an integral
equation. As parameters defining the function y let us choose its values
$y_1, y_2, \dots, y_n$ at some system of points (on a net).

The functional equation must then be changed to a system of numerical
equations containing n unknown quantities $y_k$ $(k = 1, \dots, n)$. Such a
replacement may, as a rule, be made in many ways. Here it is always
necessary to take pains that the solution of the numerical system differs
sufficiently little from the solution of the functional equation.

We give several examples of this sort of replacement. When we solve a
differential equation of the first order $y' = f(x, y)$ by Euler's method,
we replace this equation by a recursive numerical scheme which enables
us to make an approximate calculation of each succeeding value of the
unknown function from the previous value (Chapter V, §5):

$$y_{n+1} = y_n + (x_{n+1} - x_n) f(x_n, y_n).$$
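The recursive scheme translates directly into a loop. The following minimal sketch assumes equally spaced points and uses the hypothetical test equation y' = y with y(0) = 1, whose exact solution at x = 1 is e; the function name `euler` and its parameters are illustrative.

```python
import math

def euler(f, x0, y0, x_end, steps):
    """Euler's scheme y_{n+1} = y_n + (x_{n+1} - x_n) f(x_n, y_n) on a uniform net."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y

if __name__ == "__main__":
    for steps in (10, 100, 1000):
        approx = euler(lambda x, y: y, 0.0, 1.0, 1.0, steps)
        print(steps, approx, abs(math.e - approx))
```

The printed errors should decrease roughly in proportion to the step, which is the slow first-order behavior typical of this scheme.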
For an approximate solution of the Laplace equation

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

by the net method, we replace this equation by a linear algebraic system
(Chapter VI, §5)

$$u(x + h, y) + u(x, y + h) + u(x - h, y) + u(x, y - h) - 4u(x, y) = 0.$$
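One simple way to solve this algebraic system is to sweep repeatedly over the interior nodes, replacing each value by the mean of its four neighbors. The sketch below does this on the unit square; the sweeping procedure (Gauss-Seidel iteration), the square region, and the boundary function x² − y² are assumptions chosen for illustration, not the method prescribed by the text. The boundary function is harmonic and also satisfies the net equation exactly, so the computed values can be compared with it at every node.

```python
def solve_laplace(boundary, n, sweeps=500):
    """Solve the five-point net equations on the unit square with n x n cells."""
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    # Prescribe the boundary values; interior nodes start from zero.
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i * h, j * h)
    # Each sweep enforces u(x, y) = mean of the four neighboring values.
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                u[i][j] = 0.25 * (u[i + 1][j] + u[i - 1][j]
                                  + u[i][j + 1] + u[i][j - 1])
    return u

def harmonic(x, y):
    """Test boundary data: x^2 - y^2 satisfies the Laplace equation."""
    return x * x - y * y

if __name__ == "__main__":
    n = 10
    u = solve_laplace(harmonic, n)
    h = 1.0 / n
    print(max(abs(u[i][j] - harmonic(i * h, j * h))
              for i in range(n + 1) for j in range(n + 1)))
```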

Let us consider one more example of such a kind. Let it be required to
solve numerically the integral equation

$$y(x) = f(x) + \int_a^b K(x, s) \, y(s) \, ds. \tag{6}$$

The points at which we wish to find the values of the unknown function
$y(x)$ will be denoted by $x_1, x_2, \dots, x_n$. In order to set up the system
of numerical equations replacing (6), we require that equation (6) be
satisfied not for all the x on the interval $a \le x \le b$ but only at the
points $x_i$ $(i = 1, 2, \dots, n)$

$$y(x_i) = f(x_i) + \int_a^b K(x_i, s) \, y(s) \, ds.$$

Then we replace the integral by any approximate quadrature (by the
trapezoidal rule, Simpson's rule, or some other)* with the points of
division $x_1, \dots, x_n$

$$\int_a^b K(x_i, s) \, y(s) \, ds \approx \sum_{j=1}^{n} A_j K(x_i, x_j) \, y(x_j).$$

To determine the desired values of $y(x_i)$, we have the system of linear
algebraic equations

$$y(x_i) = f(x_i) + \sum_{j=1}^{n} A_j K(x_i, x_j) \, y(x_j) \qquad (i = 1, 2, \dots, n). \tag{7}$$
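A system of the form (7) can be assembled in a few lines. The sketch below is a minimal illustration under stated assumptions: the coefficients A_j are those of the trapezoidal rule on equally spaced points, and the system is solved by simple repeated substitution, which works only when the kernel is small enough (a direct linear solver would be the general route). The kernel K(x, s) = xs and the function f(x) = 2x/3 on [0, 1] are hypothetical test data chosen so that the exact solution is y(x) = x.

```python
def solve_integral_equation(kernel, f, a, b, n, iterations=200):
    """Values y(x_i) from system (7) with trapezoidal coefficients A_j."""
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    weights = [h] * n
    weights[0] = weights[-1] = h / 2          # trapezoidal rule: half weight at the ends
    y = [f(x) for x in xs]                    # initial guess
    for _ in range(iterations):
        # Substitute the current values into the right side of (7).
        y = [f(xi) + sum(A * kernel(xi, xj) * yj
                         for A, xj, yj in zip(weights, xs, y))
             for xi in xs]
    return xs, y

if __name__ == "__main__":
    xs, ys = solve_integral_equation(lambda x, s: x * s,
                                     lambda x: 2 * x / 3, 0.0, 1.0, 41)
    print(max(abs(y - x) for x, y in zip(xs, ys)))
```

The remaining deviation from y(x) = x reflects the error of the quadrature formula and shrinks as more points of division are used.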

We note that all the methods considered of seeking an unknown
function have involved determining certain parameters which define it
approximately. Thus the exactness of these methods depends on how
well the function is defined by this system of parameters; for example,
how well it may be approximated by an expression of the form (7) or
represented by its values at a certain system of points. Questions of this
kind constitute a particular branch of mathematics, called the theory of
approximation of functions (Chapter XII). From this it can be seen that
the theory of approximation has very great value for applied mathematics.

* Cf. Chapter XII, §3.

Convergence of approximate methods and an estimate of error. Let us 
examine in more detail the requirements for a computational method. 
The simplest and most basic of these requirements is the possibility of 
finding the desired quantity with any chosen degree of accuracy. 

The required exactness of a computation may change greatly from
one problem to another. For certain rough technical computations, two
or three decimal places will be sufficiently exact. Most engineering
computations are carried out to three or four decimal places. But
considerably greater exactness is often required in scientific calculations.
Generally speaking, the need for greater accuracy has increased with
the passage of time.

Particularly important, therefore, are the approximation methods and 
processes that allow one to get results with as great a degree of accuracy 
as desired. Such methods are called convergent. Since they are encountered 
most often in practice and since the requirements they must satisfy are 
typical, we will keep them in mind in what follows. 

Let x be the exact value of a desired quantity. For every such method
we may construct a sequence of approximations $x_1, x_2, \dots, x_n, \dots$ to
the solution x.

After showing how the approximations are constructed, the first
problem in the theory of the method is to establish the convergence of
the approximations to the solution, $x_n \to x$, and if the method is not
always convergent, to set out the conditions under which it will converge.

After the convergence is established there arises the more difficult and
subtle problem of an estimate of the rapidity of convergence, i.e., an
estimate of how rapidly $x_n$ converges to the solution x for $n \to \infty$.
Every convergent method theoretically guarantees the possibility of finding
the solution with any desired degree of accuracy, if we take an approximation
$x_n$ with sufficiently large index n. But, as a rule, the larger the n, the
greater the labor required to calculate $x_n$. Thus, if $x_n$ converges slowly
to x, then to get the needed accuracy it may be necessary to make
enormous computations.

In mathematics itself, and especially in its applications, many cases
are known of a convergent process for finding the solution x, which would
require more computational work than can be carried out even on present-day
high-speed computers.*

Insufficiently fast convergence is one of the criteria by which the
disadvantages of a given method are judged. But this criterion is, of
course, not the only one and in comparing methods one must consider
many other sides of the question, in particular the convenience of making
the computations on machines. Of two methods we sometimes prefer
to use the one with somewhat slower convergence, if the computations
by this method are easier to carry out on a computing machine.

The error produced by replacing x with its approximate value $x_n$ is
equal to the difference $x - x_n$. Its exact value is unknown, and in order
to estimate the rapidity of convergence, we must find an upper bound
for the absolute value of this difference, i.e., a quantity $\Delta_n$ such that

$$|x - x_n| \le \Delta_n,$$

which we call an error estimate. Later we give examples of estimates
$\Delta_n$. Consequently, the usual method of judging the rapidity of
convergence of a method is to examine how fast the estimate $\Delta_n$
decreases with increasing n. In order that the estimate reflect the actual
degree of nearness of $x_n$ to x, it is necessary that $\Delta_n$ differ little
from $|x - x_n|$. Also the estimate $\Delta_n$ must be effective, i.e., be such
that it can itself be found, otherwise it cannot be used.

Let x be a numerical variable whose value we wish to determine from
some equation. We assume that our equation reduces to the form

$$x = \varphi(x). \tag{8}$$


* Let us mention some simple examples of slowly converging computational
processes. It is known that the series

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$

converges to the natural logarithm of the number 2. We can find ln 2
approximately by means of this series, by computing the sum

$$s_n = 1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{(-1)^{n-1}}{n}$$

of the first n terms for sufficiently large n. But it may be shown that to
compute ln 2 with an error less than half of the fifth significant figure, we
must take more than 100,000 terms of the series. To find the sum of such a
number of terms, if we are using, for example, only a desk computer, would be
very laborious. Another familiar example is the series

$$\frac{1}{\sqrt{2}} = 1 - \frac{1}{2 \cdot 1!} + \frac{1 \cdot 3}{2^2 \cdot 2!} - \frac{1 \cdot 3 \cdot 5}{2^3 \cdot 3!} + \frac{1 \cdot 3 \cdot 5 \cdot 7}{2^4 \cdot 4!} - \cdots.$$

Its convergence is so slow that to compute $1/\sqrt{2}$ with accuracy of
$10^{-5}$, we would need to take about $10^{10}$ terms, which is difficult even
with high-speed machines.
To this equation we apply the method of iteration, which is also often
called the method of successive approximations. To explain the method
itself and the estimates connected with it, we will examine the case of one
numerical equation, although the method also applies to systems of
numerical equations, to differential equations, integral equations, and
many other cases. The application of the method to ordinary differential
equations has already been illustrated in Chapter V, §5.

We will assume that we have somehow found an approximate value
$x_0$ for a root of the equation. If $x_0$ were an exact solution of equation
(8), then after substituting it in the right side $\varphi(x)$ of the equation
we would get a result equal to $x_0$. But since $x_0$, generally speaking, is
not an exact solution, the result of the substitution will differ from $x_0$.
Let us denote it by $x_1 = \varphi(x_0)$.

In order to establish in which cases $x_1$ will be nearer to the exact
solution than $x_0$, we turn to a geometric interpretation of our problem.
Let us consider the function

$$y = \varphi(x). \tag{9}$$

We choose a numerical axis and represent the numbers x and y by points 
of this axis. Equation (9) assigns to every point x a corresponding point y 
on the same axis. It may be regarded as a rule that produces a point 
transformation of the numerical axis into itself. 

Consider the segment $[x_1, x_2]$ on the numerical axis. By the
transformation (9) the points $x_1$ and $x_2$ will be transformed into the
points

$$y_1 = \varphi(x_1) \quad \text{and} \quad y_2 = \varphi(x_2).$$

The segment $[x_1, x_2]$ is transformed into the segment $[y_1, y_2]$. The
ratio

$$k = \frac{|y_2 - y_1|}{|x_2 - x_1|}$$

is called the "coefficient of dilation" of the segment under the
transformation. If $k < 1$, we will have a contraction of the segment.

We return to equation (8). It says that the desired point x must be
transformed into itself under the transformation (9). Thus solving equation
(8) is equivalent to finding a point on the numerical axis which is
transformed into itself under the transformation (9), i.e., remains fixed.

We now consider the segment $[x, x_0]$, one end of which lies at the
fixed point x and the other at the point $x_0$. Under the given transformation
$x_0$ goes into $x_1$ and the segment $[x, x_0]$ into the segment $[x, x_1]$.
If the function $\varphi$ has the property that under transformation (9) every
segment is contracted, then $x_1$ will certainly be closer than $x_0$ to the
root of equation (8).

Since we wish to obtain approximations which converge to the exact
solution of (8), we make the same transformation many times in succession
on the right side of (8) and construct the sequence of numbers

$$x_1 = \varphi(x_0), \quad x_2 = \varphi(x_1), \quad \dots, \quad x_{n+1} = \varphi(x_n), \quad \dots. \tag{10}$$

Here we will prove that the sequence of approximations (10) converges.*
Let us assume that the function $\varphi(x)$ is defined on a certain segment
[a, b] and that equation (9) gives a transformation of [a, b] into itself,
i.e., for every x belonging to [a, b], $y = \varphi(x)$ will also belong to
[a, b]. We will also assume that the initial approximation $x_0$ is in [a, b];
all the successive approximations (10) will then also lie in [a, b]. Under
these conditions the following theorem is true. If $\varphi(x)$ has a
derivative $\varphi'$ satisfying the condition

$$|\varphi'| \le q < 1$$

on [a, b], then the following proposition holds. Equation (8) has a root $x^*$
in the segment [a, b]. The sequence (10) converges to this root, and the
rapidity of convergence is characterized by the estimate

$$|x^* - x_n| \le \frac{m q^n}{1 - q},$$

where $m = |x_0 - \varphi(x_0)| = |x_0 - x_1|$. Equation (8) has a unique root
in [a, b].

To prove these statements, we estimate the difference $x_{n+1} - x_n$. If
Taylor's formula is applicable (Chapter II, §9, (26)), we obtain, for n = 1

$$x_2 - x_1 = \varphi(x_1) - \varphi(x_0) = \varphi'(\xi_0)(x_1 - x_0).$$

Then $\xi_0$ lies between $x_1$ and $x_0$ and so belongs to the segment [a, b].
Thus $|\varphi'(\xi_0)| \le q$ and

$$|x_2 - x_1| \le q |x_1 - x_0| = mq.$$

Similarly

$$|x_3 - x_2| = |\varphi(x_2) - \varphi(x_1)| = |\varphi'(\xi_1)(x_2 - x_1)| \le q |x_2 - x_1| \le m q^2.$$

Continuing these estimates, we have, for every value of n, the inequality

$$|x_{n+1} - x_n| \le m q^n. \tag{11}$$


* Because of the geometric interpretation, this theorem and others like it are
often called contraction theorems.
We now establish the convergence of the sequence $x_n$. To this end we
consider the auxiliary series

$$x_0 + (x_1 - x_0) + (x_2 - x_1) + \cdots + (x_n - x_{n-1}) + \cdots. \tag{12}$$

The partial sum of the first n + 1 of its terms is equal to

$$s_{n+1} = x_0 + (x_1 - x_0) + \cdots + (x_n - x_{n-1}) = x_n.$$

Thus $\lim_{n \to \infty} s_{n+1} = \lim_{n \to \infty} x_n$ and the existence of
a finite limit for $x_n$ is equivalent to the convergence of the series (12).
We compare the series (12) with the series

$$|x_0| + m + mq + \cdots + m q^{n-1} + \cdots.$$

From the estimate (11) the terms of the series (12) are not greater in
absolute value than the corresponding terms in the latter series. But this
series, except for its first term $|x_0|$, is a geometric progression with
common ratio q, and since $q < 1$, the series converges. Series (12) is
thus also convergent, and the sequence (10) is convergent to some finite
limit $x^*$

$$\lim_{n \to \infty} x_n = x^*.$$

Obviously $x^*$ belongs to the segment [a, b], since all the $x_n$ belong to
it. If in the equation $x_{n+1} = \varphi(x_n)$ we pass to the limit as
$n \to \infty$, then in the limit we get the equation $x^* = \varphi(x^*)$,
which shows that $x^*$ actually satisfies equation (8). We now estimate how
close $x_n$ is to $x^*$. We choose $x_n$ and any following approximation
$x_{n+p}$

$$|x_{n+p} - x_n| = |(x_{n+p} - x_{n+p-1}) + (x_{n+p-1} - x_{n+p-2}) + \cdots + (x_{n+1} - x_n)|$$
$$\le m q^{n+p-1} + m q^{n+p-2} + \cdots + m q^n = \frac{m q^n - m q^{n+p}}{1 - q}.$$

Hence, for $p \to \infty$, from $x_{n+p} \to x^*$ and $q^{n+p} \to 0$ it follows
that

$$|x^* - x_n| \le \frac{m}{1 - q} q^n.$$

It remains to prove the statement on uniqueness. Let $x'$ be any solution
of the equation on $[a, b]$. We estimate the difference $x' - x^*$:

$$|x' - x^*| = |\varphi(x') - \varphi(x^*)| = |\varphi'(\xi)(x' - x^*)| \le q\,|x' - x^*|,$$

from which

$$(1 - q)\,|x' - x^*| \le 0.$$

Since $1 - q > 0$, this inequality is possible only for $|x' - x^*| = 0$,
which means that $x'$ is identical with $x^*$.

The theorem not only exhibits sufficient conditions for the convergence
of the method of iteration but also allows us to estimate the necessary
number of steps in the computation, i.e., how large n must be taken to
obtain the required accuracy when the exact solution $x^*$ is replaced by $x_n$.
Such an estimate is effective, since the quantities m and q appearing in
the inequality $|x^* - x_n| \le \frac{m}{1 - q}\, q^n$ may in fact be found by
investigating the function $\varphi$.
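As a rough illustration of how this estimate is used, the sketch below solves the inequality $\frac{m}{1-q} q^n \le \varepsilon$ for the smallest n. The function name `steps_needed` and the tolerance `eps` are names introduced here for illustration; the sample values m = 0.10715 and q = 0.4 are assumptions taken from the arctangent example that follows (m is the first difference $x_1 - x_0$, and 0.4 bounds $|\varphi'(x)| = 2/(1 + 4x^2)$ for $x \ge 1$).

```python
import math


def steps_needed(m: float, q: float, eps: float) -> int:
    """Smallest n for which m / (1 - q) * q**n <= eps (the a priori bound)."""
    if not 0.0 < q < 1.0:
        raise ValueError("the estimate requires 0 < q < 1")
    if m <= 0.0:
        return 0  # x_1 equals x_0, so the starting value is already a root
    # q**n <= eps * (1 - q) / m; dividing by log(q) < 0 flips the inequality.
    n = math.ceil(math.log(eps * (1.0 - q) / m) / math.log(q))
    return max(n, 0)


# Hypothetical use: accuracy of half a unit in the fifth decimal place.
print(steps_needed(0.10715, 0.4, 5e-6))
```

The bound is a worst-case guarantee, so the actual iteration may settle somewhat sooner than the number of steps it predicts.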

As an example let us consider the equation $x = k \tan x$, which has
many practical applications. For definiteness, we consider the case k = 0.5.
Let it be required to find the smallest positive root of the equation
$x = \tfrac{1}{2} \tan x$. It must lie near the point 1 and be somewhat larger than 1,
as can be easily established from any table or graph of the function tan x.

To secure the condition $|\varphi'| \le q < 1$, which enters into the theorem
on the convergence of the method of iteration, we invert the function tan x
and consider the equation $x = \arctan 2x$, which is equivalent to the
given one.

We give here the results of the computation. For the original approximation
we have taken the value $x_0 = 1$. The following approximations
are computed from a table of the function arctan x, from which one
finds the following numerical values:

$x_1 = \arctan 2 = 1.10715,$
$x_2 = \arctan 2.21430 = 1.14660,$
$x_3 = \arctan 2.29320 = 1.15959,$
$x_4 = \arctan 2.31918 = 1.16370,$
$x_5 = \arctan 2.32740 = 1.16498,$
$x_6 = \arctan 2.32996 = 1.16538,$
$x_7 = \arctan 2.33076 = 1.16550,$
$x_8 = \arctan 2.33100 = 1.16554,$
$x_9 = \arctan 2.33108 = 1.16555,$
$x_{10} = \arctan 2.33110 = 1.16556,$
$x_{11} = \arctan 2.33112 = 1.16556.$

The computation may be stopped here, since further iterations will
repeat the value of the root

$$x^* = 1.16556.$$
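The computation above can be repeated with a few lines of code. This is only a sketch: the helper name `fixed_point`, the stopping tolerance, and the iteration cap are choices made here, and the library arctangent stands in for the printed table used in the text.

```python
import math


def fixed_point(phi, x0, tol=1e-6, max_iter=100):
    """Iterate x_{n+1} = phi(x_n) until two successive values differ by less than tol."""
    xs = [x0]
    for _ in range(max_iter):
        xs.append(phi(xs[-1]))
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs


# The equation x = arctan(2x) with the starting value x_0 = 1.
xs = fixed_point(lambda x: math.atan(2.0 * x), 1.0)
for n, x in enumerate(xs):
    print(f"x_{n} = {x:.5f}")
```

Stopping when successive values agree is a practical rule rather than a proof of accuracy; the estimate $|x^* - x_n| \le \frac{m}{1-q} q^n$ is what actually bounds the error.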
A geometric illustration of the approximations to the root is given in
figure 1. Here $x_n$ tends to $x^*$ so rapidly that $x_4$ is already
indistinguishable from $x^*$ in the diagram.

Fig. 1.

Let us give one more example of the method of iteration. We solve
numerically the integral equation

$$y(x) = \frac{1}{6}\int_0^1 e^{xt}\, y(t)\, dt + e^x - \frac{1}{6(x + 1)}\left(e^{x+1} - 1\right). \tag{13}$$

Its exact solution is $y = e^x$.

First we replace the integral equation by a system of linear algebraic
equations. To this end the interval of integration [0, 1] is divided into
four equal parts at the points $t = 0, \tfrac14, \tfrac12, \tfrac34, 1$. The values of the
unknown function y at these points will be denoted by $y_0, y_1, y_2, y_3, y_4$,
respectively. If we require that the equation be satisfied for
$x = 0, \tfrac14, \tfrac12, \tfrac34, 1$, when the integral is replaced by Simpson's sum for
four partial intervals (Chapter XII, §3, (6)), we have the following system
of equations for $y_k$:

$$\begin{aligned}
y_0 &= \tfrac16\,(0.083333y_0 + 0.333333y_1 + 0.166667y_2 + 0.333333y_3 + 0.083333y_4) + 0.713619,\\
y_1 &= \tfrac16\,(0.083333y_0 + 0.354831y_1 + 0.188858y_2 + 0.402077y_3 + 0.107002y_4) + 0.951980,\\
y_2 &= \tfrac16\,(0.083333y_0 + 0.377716y_1 + 0.214004y_2 + 0.484997y_3 + 0.137393y_4) + 1.261867,\\
y_3 &= \tfrac16\,(0.083333y_0 + 0.402077y_1 + 0.242499y_2 + 0.585018y_3 + 0.176417y_4) + 1.664181,\\
y_4 &= \tfrac16\,(0.083333y_0 + 0.428008y_1 + 0.274787y_2 + 0.705667y_3 + 0.226523y_4) + 2.185861.
\end{aligned}$$

This system is solved by the method of iteration. As our initial approximation
to $y_k$ (k = 0, 1, 2, 3, 4) we will take the constant terms of the
corresponding equations: $y_0^{(0)} = 0.713619$, $y_1^{(0)} = 0.951980$, .... The
values found for the successive approximations are given in Table 1:





Table 1.

Number of
approximation        y_0        y_1        y_2        y_3        y_4

1                  0.93428    1.20841    1.56129    2.01542    2.59972
2                  0.98517    1.26699    1.62905    2.09419    2.69173
3                  0.99667    1.28021    1.64433    2.11194    2.71245
4                  0.99926    1.28319    1.64778    2.11595    2.71713
5                  0.99985    1.28386    1.64856    2.11685    2.71818
6                  0.99998    1.28402    1.64873    2.11705    2.71842
7                  1.00001    1.28405    1.64877    2.11710    2.71847

Value of the
exact solution     1.00000    1.28403    1.64872    2.11700    2.71828


At the end of Table 1 the value of the exact solution is given for
comparison. Further approximations would not improve the values of $y_k$.
The discrepancy in the last digits of the $y_k$ comes from the error introduced
by replacing the integral by Simpson's sum.
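The whole procedure (Simpson's sum, then iteration from the constant terms) can be sketched as follows. The sketch assumes that the factor in front of the integral in (13) is 1/6 and that the kernel is $e^{xt}$, which is what the printed coefficients and constant terms imply; the names `free_term`, `sweep`, and `history` are introduced here for illustration only.

```python
import math

# Simpson nodes and weights on [0, 1] for four partial intervals (h = 1/4).
nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
weights = [c * 0.25 / 3.0 for c in (1, 4, 2, 4, 1)]


def free_term(x):
    """Known part of equation (13): e^x - (e^(x+1) - 1) / (6 (x + 1))."""
    return math.exp(x) - (math.exp(x + 1.0) - 1.0) / (6.0 * (x + 1.0))


def sweep(y):
    """One step of iteration: put the old values into every right-hand side."""
    return [
        sum(w * math.exp(x * t) * yt for w, t, yt in zip(weights, nodes, y)) / 6.0
        + free_term(x)
        for x in nodes
    ]


y = [free_term(x) for x in nodes]  # initial approximation: the constant terms
history = []
for _ in range(7):
    y = sweep(y)
    history.append(y)

for number, row in enumerate(history, start=1):
    print(number, " ".join(f"{value:.5f}" for value in row))
```

Because the coefficients are computed here in full precision rather than rounded to six places, the last printed digit may differ slightly from Table 1.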


Stability of approximate methods. The needs of practical computation
impose on approximative methods another general requirement that must
be kept in mind because of its great importance. This is the requirement
of the stability of the computational process. The essence of the matter
is as follows: Every approximative method leads to some computational
scheme, and it often turns out that to produce all the required numbers,
we must carry out a long series of computational steps in accordance
with the scheme. At each step the computation is not carried out exactly
but only to some specific number of significant figures, and thus at each
step we introduce a small error. All such errors will have their influence
on the final results.

The computational scheme adopted may sometimes turn out to be so
unsatisfactory that small errors made at the beginning may have a greater
and greater influence as the calculations are carried further and may
produce in the final stages a wide deviation from the exact values.

Let us consider the numerical solution of a differential equation

$$y' = f(x, y)$$

with the initial condition $y(x_0) = y_0$, where we are required to find the
values of $y(x)$ for equally spaced values $x_k = x_0 + kh$ (k = 0, 1, ...).

We assume that the computation has begun and has been carried out
to step n with the results shown in Table 2.

Table 2.

x            y            y' = f

x_0          y_0          y'_0
x_1          y_1          y'_1
...          ...          ...
x_{n-1}      y_{n-1}      y'_{n-1}
x_n          y_n          y'_n

We must now find $y_{n+1}$. By the Euler method of broken lines we make the
approximation

$$y_{n+1} = y_n + h y'_n. \tag{14}$$

Here $y_{n+1}$ is calculated only from the numbers $y_n$ and $y'_n$ which occur
in the last line of Table 2. Suppose we wish to increase the accuracy
and for this purpose make use of all the quantities appearing in the last
two lines. Then we may construct the computational formula

$$y_{n+1} = -4y_n + 5y_{n-1} + h\,(4y'_n + 2y'_{n-1}). \tag{15}$$

We note that if the computation is absolutely exact, i.e., with an infinite
number of significant digits, then formula (14) will give the exact result
whenever y is a linear polynomial, and formula (15) will be exact for
every polynomial of degree up to and including the third. It would seem at first
glance that the results produced by applying formula (15) must be more
exact than those found by the method of broken lines. However, it can
easily be seen that formula (15) is inappropriate for computation, since
its application may produce a rapid increase in the error.

The values of the derivative $y'_n$ and $y'_{n-1}$ enter with the small multiplier h,
so that the errors in these values have less influence than the errors in $y_n$
and $y_{n-1}$. For simplicity we will assume that the values of f are found
exactly so that we do not need to take them into account in the following
attempt to estimate the error in general in the above two cases. Let us
suppose that in finding $y_{n-1}$ we make an error of $+\varepsilon$, and in finding $y_n$
an error of $-\varepsilon$. Then, as equation (15) shows, in $y_{n+1}$ we will make an
error of magnitude $-4(-\varepsilon) + 5\varepsilon = +9\varepsilon$. In $y_{n+2}$ the error will be
$-4(9\varepsilon) + 5(-\varepsilon) = -41\varepsilon$ and will grow rapidly as we continue. Formula (15)
leads to a computational process that is unstable with respect to errors and
must be discarded.
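The growth of the error can be followed a few steps further with a short sketch. It assumes, as the text does, that the values of f are exact, so that only the terms $-4y_n + 5y_{n-1}$ of formula (15) carry the error; the function name `propagate` is introduced here for illustration.

```python
def propagate(e_prev, e_curr, steps):
    """Follow the error recurrence e_{n+1} = -4 e_n + 5 e_{n-1} of formula (15)."""
    errors = []
    for _ in range(steps):
        e_prev, e_curr = e_curr, -4 * e_curr + 5 * e_prev
        errors.append(e_curr)
    return errors


# Errors measured in units of epsilon: +1 in y_{n-1} and -1 in y_n.
print(propagate(1, -1, 5))
```

The magnitudes grow by a factor of roughly five per step, which is why a rounding error in the seventh decimal place can dominate the result after about a dozen steps.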

The following example shows how badly the results may be distorted by
an unstable computational scheme. Here we have solved the differential
equation $y' = y$ with the initial condition $y_0 = 1$. The exact solution is
$y = e^x$. For the numerical solution we took equally spaced values of the
independent variable x with steps h = 0.01, i.e., $x_k = 0.01k$. An approximate
solution was computed in two ways: by the method of broken lines
(14) and by formula (15). For comparison, Table 3 gives the value of
the exact solution to seven decimal places.

The approximate values of the solution found by formula (15) are more
exact for the first few steps than the results given by the method of broken
lines. But after a small number of steps the instability of formula (15)
begins to distort the approximate values of $y_k$ quite strongly and leads
to numbers which are very different from the true values of $y_k$.


Table 3.

                             Values of the approximate solutions computed
         Values of the
x        exact solution      by formula (14)      by formula (15)

0.00     1.0000000           1.0000000            1.0000000
0.01     1.0100502           1.0100000            1.0100502
0.02     1.0202013           1.0201000            1.0202012
0.03     1.0304545           1.0303010            1.0304553
0.04     1.0408108           1.0406040            1.0408070
0.05     1.0512711           1.0510100            1.0512899
0.06     1.0618365           1.0615201            1.0617431
0.07     1.0725082           1.0721353            1.0729726
0.08     1.0832871           1.0828567            1.0809789
0.09     1.0941743           1.0936853            1.1056460
0.10     1.1051709           1.1046222            1.0481559
0.11     1.1162781           1.1156684            1.3996456
0.12     1.1274969           1.1268250            -0.2808540
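The experiment behind Table 3 can be sketched in code. Two assumptions are made here and are not stated in the text: every computed value is rounded to seven decimal places, and the second starting value needed by formula (15) is taken from the exact solution, as the row x = 0.01 of the table suggests. The names `euler` and `two_step` are illustrative.

```python
import math

H = 0.01      # step size used in the text
STEPS = 12    # the table runs from x = 0.00 to x = 0.12
DIGITS = 7    # assumed working precision: seven decimal places


def f(y):
    return y  # right-hand side of y' = y


# Formula (14): Euler's method of broken lines.
euler = [1.0]
for _ in range(STEPS):
    euler.append(round(euler[-1] + H * f(euler[-1]), DIGITS))

# Formula (15): a two-step scheme, so it needs y_0 and y_1 to start.
two_step = [1.0, round(math.exp(H), DIGITS)]
for n in range(1, STEPS):
    y_next = (-4 * two_step[n] + 5 * two_step[n - 1]
              + H * (4 * f(two_step[n]) + 2 * f(two_step[n - 1])))
    two_step.append(round(y_next, DIGITS))

for k in range(STEPS + 1):
    print(f"{k * H:.2f}  {math.exp(k * H):.7f}  {euler[k]:.7f}  {two_step[k]:.7f}")
```

Under these assumptions the column for formula (15) swings away from the exact solution in the same way as in the table, while Euler's method stays within about 0.001 of it over the same range.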


Choice of computational methods. Every computation may in the
final analysis be reduced to the four arithmetic operations of addition,
subtraction, multiplication, and division. Describing a method of computation
consists of stating the initial data with which one begins and then
prescribing which arithmetical operations, and in which order, are to be
performed in order to get the desired results. Let us show by a very
simple example how much depends in the organization of the calculations
on the experience and knowledge of the mathematician responsible for
setting up the computational scheme and what excellent results can be
obtained by a suitable choice of methods especially adapted to the
situation.

Let it be required to solve the system of n equations in n unknowns
$x_1, x_2, \ldots, x_n$

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots &\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n.
\end{aligned}$$

From the theory of algebraic systems (Chapter XVI, §3) we have an explicit
expression for the values of the unknowns by means of determinants

$$x_j = \frac{\Delta_j}{\Delta} \qquad (j = 1, 2, \ldots, n). \tag{16}$$

Here $\Delta$ is the determinant of the system

$$\Delta = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}$$

and $\Delta_j$ is the determinant obtained from $\Delta$ by replacing its jth column
by the column of constant terms in the system.

Let us assume that we wish to make use of formula (16) to solve the
system and that we have begun to compute the determinants on the basis
of their usual definition, without recourse to any simplifications. How
many multiplications and divisions will be necessary? (Addition and
subtraction will not be taken into account, since they are relatively simple
operations.) We face the prospect of computing n + 1 determinants of
order n. Each of them consists of n! terms, each term being the product
of n factors and consequently requiring n − 1 multiplications. For the
computation of all the determinants, we must carry out $(n + 1)\, n!\, (n - 1)$
multiplications. The total number of multiplications and
divisions will be equal to $(n^2 - 1)\, n! + n$.

We now choose another method of solving the system, namely successive
elimination of the unknowns. The scheme of computation corresponding
to this method is associated with the name of Gauss. We find $x_1$ from
the first equation of the system

$$x_1 = \frac{b_1}{a_{11}} - \frac{a_{12}}{a_{11}}\, x_2 - \cdots - \frac{a_{1n}}{a_{11}}\, x_n.$$

For this we need n divisions. Substituting $x_1$ in each of the following n − 1
equations requires n multiplications. The elimination of $x_1$ and the setting
up of n − 1 equations in the unknowns $x_2, \ldots, x_n$ will then require $n^2$
multiplications and divisions. Continuing in this way, we find that to
compute all the values of $x_j$ (j = 1, ..., n) the elimination method requires
$\frac{n}{6}(2n^2 + 9n - 5)$ multiplications and divisions. Let us compare these
two results. For the solution of a system of five equations in the first
case we would need 2,885 multiplications and divisions, and in the second
case 75.

For a system of ten equations the number of operations will be
$(10^2 - 1) \cdot 10! + 10 \approx 360{,}000{,}000$ and $\frac{10}{6}(2 \cdot 10^2 + 9 \cdot 10 - 5) = 475$,
respectively. So we see that the amount of computational labor depends
very strongly on the choice of the method of computing. In organizing
the scheme of computation, it is often possible by a rational choice of
the method to reduce the necessary amount of work very greatly.
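The two operation counts are easy to tabulate. The sketch below simply evaluates the two formulas quoted in the text; the function names `cramer_ops` and `elimination_ops` are introduced here for illustration.

```python
from math import factorial


def cramer_ops(n: int) -> int:
    """n + 1 determinants, n! terms each, n - 1 multiplications per term, plus n divisions."""
    return (n + 1) * factorial(n) * (n - 1) + n


def elimination_ops(n: int) -> int:
    """Count quoted in the text for successive elimination: n (2 n^2 + 9 n - 5) / 6."""
    return n * (2 * n * n + 9 * n - 5) // 6


for n in (5, 10, 20):
    print(n, cramer_ops(n), elimination_ops(n))
```

The first count grows like n! and the second like $n^3$, so the gap widens very quickly as n increases.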

§2. The Simplest Auxiliary Means of Computation*

Tables. The oldest auxiliary means of computation consists of tables.
The simplest tables, e.g. the multiplication table and tables of logarithms
or of the trigonometric functions, are certainly well known to the reader.
The range of problems that are solvable in practical affairs is being
continuously extended. New problems are often solved by the application
of new formulas or may lead to new functions, so that the number of
tables required is constantly increasing.

Every table, regardless of how it is constructed, contains the results of
earlier computations and therefore represents a sort of mathematical
memory. Printed or written tables are intended to be read by human
beings. But we might also consider tables formed in some special manner,
for example by holes punched in cards, which
are intended to be read by computing machines. Such tables are
comparatively rare and we will not discuss them here.

The tables in widest use are those of the values of functions. If a function
y depends on only one argument x, then the simplest table corresponding
to it has the form

x          y

x_1        y_1
x_2        y_2
...        ...
x_n        y_n          (17)

* In this section we give a description only of the simplest auxiliary equipment and
machines. The description of contemporary rapid computing machines is given in
Chapter XIV. For lack of space we have also omitted graphical methods.

This is called a single-entry table.* From it we may take without further
effort only the values corresponding to tabulated values of x. Values
corresponding to x not in the table must be found by interpolation of
various kinds, as described in Chapter XII.† Consequently the tables
often contain, in addition to the values of the functions, certain auxiliary
quantities which make the interpolation easier. Usually these are values
of the first or second differences. More specialized tables require specially
devised interpolation formulas for which they include the corresponding
data.

In a table of a function of two arguments u = f(x, y) the values of the
function are distributed in a double-entry table of the following form

x \ y      y_1       y_2       ...       y_m

x_1        u_11      u_12      ...       u_1m
x_2        u_21      u_22      ...       u_2m
...        ...       ...       ...       ...
x_n        u_n1      u_n2      ...       u_nm          (17')


Each column of such a table is itself a single-entry table, so that (17') 
is a collection of many tables of the form (17). The size of a table for a 
function of two arguments is, as a rule, much greater than for a function 
of one argument with the same interval for the independent variables. 
In view of this, functions of two arguments are much less often tabulated 
than functions of one argument. 

How quickly the size of a table can grow with an increase in the number 
of arguments is shown by the following simple example. Let it be required 
to tabulate a function of four arguments f(x, y, z, t) for 100 values of each 
of the arguments. Let us assume that the function does not need to be 
computed very exactly, only to three significant figures. If under such 
conditions we tabulate a function of one argument, the whole table of 
values will consist of a hundred three-digit numbers and may easily be 
put on one page. 


* Such a column may be very long and may therefore be broken up into many smaller
columns for convenience of printing. But of course it is still called a single-entry table.

† Interpolation, as a rule, is more complicated if the tabulated values $x_i$ are farther
apart and simpler if they are closer together. Moreover, the requirement concerning
rapidity of interpolation may vary widely. In tables designed for artillery use, interpolation
must be done almost instantly, "at sight." But in tables of higher accuracy,
designed for use in the sciences, we may allow interpolations which require a whole
series of operations.

But in a four-entry table for the function f(x, y, z, t), we will have
$100^4$ combinations of the values of x, y, z, t and as many values of f,
from which it is easy to calculate that the table would fill more than 300
volumes.

Because such tables are so unwieldy, functions of many arguments are
seldom tabulated and then only in particularly simple cases. In the last
few years there has begun a systematic study of classes of functions of
many variables for which tables may be formed with a number of entries
less than the number of arguments. At the same time studies have been
begun on the simplest possible construction of such tables.

We give a simple example of such a function. 

Let it be required to tabulate the function u of three arguments x, y, z
with the following structure

$$u = f[\varphi(x, y), z].$$

It is perfectly clear that here one may restrict oneself to two double-entry
tables if we introduce the auxiliary variable $t = \varphi(x, y)$ and consider
u as the composite function

$$u = f(t, z), \qquad t = \varphi(x, y).$$

For convenience in the use of these tables, we may combine them in
the following manner. We consider the function $t = \varphi(x, y)$ and solve
this equation with respect to y:

$$y = \Phi(x, t).$$

In theory it makes no difference which of the functions $t = \varphi(x, y)$ or
$y = \Phi(x, t)$ is tabulated, but it will be more convenient for us to tabulate
the second of them. We construct two double-entry tables for the functions
$y = \Phi(x, t)$ and $u = f(t, z)$ and combine them in the manner shown in
Table 4.


Table 4.

The value of u which corresponds to given values $x_i, y_j, z_k$ is
found as follows: We find the column headed by $x_i$ and, running
down it, pick out the value $y_j$ (or one near it). In the horizontal
row through it will be the corresponding value of t. Moving further
along this horizontal row we find in column $z_k$ the required value
$u = f(x_i, y_j, z_k)$.

In this example we see that, rather than make a triple-entry table, we 
may restrict ourselves to two double-entry tables with a simple rule for 
operating with them. 

The use of various possible methods of shortening tables allows us in 
certain cases to decrease the size of the tables by a factor of ten, a hundred, 
or even a thousand in comparison with tables in which the number of 
entries is equal to the number of independent arguments. 

Desk computers. Almost as old as tables as an aid to computation
are various computing devices. Some of them were used even in ancient
Greece.

The first models of calculating machines were constructed in the 17th
century by Pascal, Moreland, and Leibnitz. From that time on the
machines were repeatedly changed and improved and were in wide use
by the end of the last century and especially at the beginning of the
present one.

We will only look at certain forms of machines and will consider the
possibility of speeding up the computations which they perform. We begin
with the small, so-called universal desk computers. Each of these,
independently of its construction, is designed to perform the four arithmetic
operations, with multiplication and division being done by repeated
series of additions and subtractions.

A typical early model of such a machine is the wheeled arithmometer
of Odner. Entering a number into the adjustable mechanism is accomplished
by moving a lever the necessary number of notches corresponding
to each digit of the number. In the process of addition each summand
is entered into the adjustable mechanism and then, by one rotation of
the handle, is transferred to the accumulator, where it is automatically
added to the number already there. Subtraction corresponds to a rotation
of the handle in the opposite direction. Multiplication is carried out by
entering the multiplicand into the adjustable mechanism and then repeatedly
adding it to itself for each digit of the multiplier. For example, to
multiply by 45 corresponds to five repeated additions of the multiplicand
and then four repeated additions of the same number moved over one
place.
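The multiplication procedure just described can be mimicked in a few lines. This is only a sketch of the arithmetic, not of the mechanism; the function name `arithmometer_multiply` and the counting of handle turns are introduced here for illustration.

```python
def arithmometer_multiply(multiplicand: int, multiplier: int):
    """Multiply by repeated addition, one digit of the multiplier at a time.

    Returns the product and the number of handle turns (additions) used.
    """
    accumulator = 0
    turns = 0
    shift = 0  # carriage position: 0 = units, 1 = tens, and so on
    while multiplier > 0:
        digit = multiplier % 10
        for _ in range(digit):  # one handle turn per unit of this digit
            accumulator += multiplicand * 10 ** shift
            turns += 1
        multiplier //= 10
        shift += 1
    return accumulator, turns


# Multiplying by 45: five additions, then four additions one place over.
print(arithmometer_multiply(123, 45))
```

The number of turns equals the sum of the digits of the multiplier, which is why the time for a multiplication on such a machine depends on the digits involved and not only on their number.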

For division the dividend is placed in the accumulator and the quotient
is found by repeated subtraction of the divisor, digit by digit. The result
is determined by the number of rotations of the handle needed in each
digit place to remove the number from the accumulator.

We have given this brief description of the computations here only in
order to make clear the direction of further improvements in desk
calculators. Some of these improvements have merely made the machines
more convenient without changing the basic scheme of their construction.
An improvement of this kind is the introduction of electricity, which
accelerates the action of the machine and frees the operator from having
to turn the handle.

To accelerate and simplify the entering of numbers into the adjusting
mechanism, keys for receiving instructions were introduced. The entering
of given digits is carried out, not by rotating a lever for the specific number
of notches, but simply by punching the corresponding key. Calculators
were invented on which it is sufficient for the operator to enter the number
on which it is desired to perform a given operation and then to punch
the key which tells which of the four operations is to be performed.
The machine will carry on from there without further human intervention.
The improvement of desk computers also brought about a remarkable
increase in their rapidity, so that in the latest models the result of a
multiplication is obtained within one second after punching the keys. Further
acceleration in the action of such machines is obviously superfluous,
since it takes considerably longer than that for the operator merely to
punch the keys and record the results.

Digital (punched card) machines and relay machines. Digital machines
were invented for statistical computations and for financial and industrial
use. They are designed to carry out a large number of uncomplicated
computations of the same kind. They are less convenient for technical
and scientific calculations because of their very small operating "memory"
and the restricted possibility of establishing computational programs for
them. In spite of these deficiencies, digital machines, up to the appearance
of fast-acting electronic machines, were quite widely used in complicated
and large-scale calculations when the whole process could be reduced to
a fairly short sequence of operations to be carried out on a massive scale
(for example, in preparing tables).

The numbers with which the digital machine operates are entered on
punched cards (figure 2). The digits and symbols are entered on the card
by means of a punch in specific places. The card is introduced into the
machine through a system of brushes. A brush under which a hole is
passing closes an electrical circuit and sets in operation a given phase
of the machine.

The different types of digital machines are designed to work in sets,
each set containing at least the following machines:

A card-punch serves to punch the holes in the cards. The machine has
a keyboard operated by hand and works at the speed of a typewriter.

A sorter is designed to arrange the cards in the order in which they are
to be introduced into the calculating machines. The speed of the work is
450-650 cards per minute.

A reproducing punch or reproducer transfers punches from one card to
another, compares two sets of cards, and selects from them cards with
specific perforations. The speed of working is around 100 cards per
minute.

A tabulator performs the operations of addition and subtraction and
also prints out the results. It may handle 6,000 to 9,000 cards an hour.

A multiplying punch (multiplier) adds, subtracts, and multiplies numbers.
The results are given in the form of punches on the cards. In working
with numbers of 6 or 7 digits it may perform 700-1,000 multiplications
an hour.

Digital machines work rather slowly. As a rough estimate of the amount
of work they can perform, we may say that the above set of machines
can replace 12 to 18 desk computers. The first attempts to create faster
machines led to the construction of relay machines based on the application
of electromechanical relays. The rate of work of such machines turned
out to be about ten times as great as the speed of the simple digital
machines. But the gains in other respects were remarkable: Relay machines
carry out complicated computational programs and have a flexible control
system that greatly extended the range of technical and scientific problems
solvable on machines. However, the appearance of these machines almost
coincided in time with the creation of the first models of electronic
machines with programmed control, and these led to a further sharp
increase in the working speed. As an indication of the great increases
in speed which have been made possible by the invention of electronic
machines, we may point out that the time required for a change of state
in an electronic tube is measured in millionths of a second.

Mathematical machines with continuous action (analogue machines).
Mathematical machines with continuous action are made up of physical
systems (mechanical apparatus, electrical circuits, and so forth),
constructed in such a manner that the same numerical interrelations occur
among the continuously changing parameters of the system (displacements,
angles of rotation, currents, voltages, and so forth) as among the
corresponding magnitudes in the mathematical problem to be solved. Such
machines are often called simulating (or analogue) machines.

Every machine with continuous action is especially designed for the
solution of some narrow class of problems.

The accuracy with which the machine gives the solution depends on
the quality of manufacture of the component parts, the assembling and
calibration of the machine, the inertial errors in its operation, and so
forth. On the basis of lengthy experience in using the machines, it has
been established that as a rule they are capable of an accuracy of two
or three significant digits. In this respect simulating machines are notably
inferior to digital machines, whose accuracy is theoretically unlimited.

An important characteristic of machines with continuous action is that 
they are suitable for the solution of a large number of problems of one 
type. In addition, they often produce the solution with considerably 
greater rapidity than a digital machine. Their principal advantage consists 
of the fact that in many cases it is more convenient to introduce the initial 
data of the problem into them, and also the results are often obtained 
in a more convenient form. 

There are many different types of simulating machines. It is possible
to create machines, or parts of machines, that are models of various
mathematical operations: addition, multiplication, integration, differentiation,
and so forth. We may also simulate various formulas used in
computation; for example, we can construct machines to compute the
values of polynomials or the Fourier coefficients in harmonic analysis
of functions. We may also simulate numerical or functional equations.
The many analogies that exist between problems from completely different
branches of science lead to the same differential equations. Identity of
the equations involved allows us, for example, to simulate heat phenomena
by electrical means and to solve problems in heat engineering by means
of electrical measurements, a procedure that is certainly convenient, since
electrical measurements are more exact than measurements of heat and
are much easier to make.

In view of the large number of simulating machines, it is impossible
to describe in a few words the machines themselves or even the principles
of their construction. To give the reader at least some idea of how
mathematical problems may be simulated, let us give a short description
of two simple mathematical machines, one of which is designed for
integration of functions and the other for approximate solution of the
Laplace equation.

The friction integrator (figure 3) is designed, as the name indicates, to
integrate functions. It works by friction. The basic idea of its construction
is shown in figure 4, where the component 1 is the base of the integrator,
2 is a horizontal friction disc with a vertical shaft, 3 is a friction roller,
i.e., a roller with a smooth rim which can not only roll along the disc
but also move in the plane perpendicular to the plane of rolling.
Components 4 and 5 constitute a screw mechanism in which the screw 4 is
connected with the carriage bearing the roller. If the pitch of the screw
is denoted by h, then rotation of the screw through angle $\gamma$ will transfer
the roller over a distance $\rho = h\gamma$ in the plane of the drawing.



Fig. 3. Fig. 4. 


Let the shaft of the disc be rotated through angle $d\alpha$. The point of
contact of the roller will then move through an arc of length $\rho\, d\alpha$. If the
roller moves over the disc without slipping, the angle of rotation of the
roller will be equal to

$$d\varphi = \frac{\rho}{r}\, d\alpha = \frac{h}{r}\, \gamma\, d\alpha,$$

where r is the radius of the roller. We assume that the rotation of the shaft
of the disc began with angle $\alpha_0$ and the initial angle of rotation of the
roller was $\varphi_0$. From this equation we obtain by integration

$$\varphi = \varphi_0 + \frac{h}{r}\int_{\alpha_0}^{\alpha} \gamma\, d\alpha.$$

By suitable choice of the relation between the angles $\gamma$ and $\alpha$, we can
use the friction integrator to compute a desired integral in a wide variety
of cases. By means of integrating mechanisms it is possible to obtain
a mechanical solution of many differential equations.

We turn to the second example. Let a domain $\Omega$ be given in the plane,
bounded by a curve l. It is required to find a function u which inside
the domain satisfies the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

and on the contour l takes given values

$$u|_l = f.$$

We introduce a square net of points

$$x_k = x_0 + kh, \quad y_k = y_0 + kh, \quad k = 0, \pm 1, \pm 2, \ldots,$$

and replace the domain $\Omega$ itself by a polygon composed of squares.
Corresponding to the contour l we have a broken line. We transfer the
boundary values of f on l to this broken line. The value of the unknown
function u at a node $(x_i, y_k)$ is denoted by $u_{i,k}$. To secure an approximate
solution of the Laplace equation in $\Omega$, we replace it by an algebraic
system, which must be satisfied for all interior points of the domain:

$$u_{i,k} = \tfrac14\left(u_{i+1,k} + u_{i,k+1} + u_{i-1,k} + u_{i,k-1}\right).$$

For a solution of this algebraic System, we may construct the following 

electrical model. We introduce in 
the plane a two-dimensional con¬ 
duction net, the scheme of which 
is illustrated in figure 5. The 
résistance between two nodes is 
assumed to be everywhere the 
same. At the boundary nodes of 
the net, we now apply voltages 
equal to the boundary values of u 
at these nodes. These voltages 
will détermine the voltage at ail 
interior points of the net. We 
dénoté by U uk the voltage at 
the node (x ,, y k ). If we apply 
KirchhofT’s law to the node 
(x ,, y k ), it is clear that at this 
node the following équation will be satisfied 

4- 1(<W - u>.*) + W.* + i - </,.*) 

+ - v,. k ) + (t/,.*_, - t/,.*)] = 0, 

which diflers only in notation from the previous équation for our algebraic 
System. At the nodes of the net the values u ik of the solution of the algebraic 
System must agréé with the voltages U lk , which can be obtained from 
the model by the usual electrical measurements. 
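
The same algebraic system can also be solved numerically by repeated averaging. The sketch below is an illustration under stated assumptions and is not the electrical method of the text: the function name `solve_laplace_on_net`, the restriction to a square domain, and the fixed number of sweeps are all hypothetical choices. Each sweep replaces every interior value by the mean of its four neighbors, much as the voltages in the net settle to their equilibrium values.

```python
def solve_laplace_on_net(boundary, n, sweeps=2000):
    """Approximate u on an (n+1) x (n+1) square net of nodes.

    boundary(i, k) supplies the given value at a boundary node; interior
    nodes start at zero and are relaxed by simple (Jacobi) averaging.
    """
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for k in range(n + 1):
            if i in (0, n) or k in (0, n):
                u[i][k] = boundary(i, k)
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n):
            for k in range(1, n):
                # Each interior value is the mean of its four neighbors.
                new[i][k] = 0.25 * (u[i + 1][k] + u[i][k + 1]
                                    + u[i - 1][k] + u[i][k - 1])
        u = new
    return u

# Boundary values taken from the linear function i + k, which satisfies
# the averaging equations exactly at every interior node.
net = solve_laplace_on_net(lambda i, k: float(i + k), n=4)
```

Simple averaging converges slowly on fine nets; it is used here only because it mirrors the equation term by term.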
XIV


ELECTRONIC 
COMPUTING MACHINES 


§1. Purposes and Basic Principles of the Operation of Electronic Computers

Mathematical methods are widely used in science and technology, but 
the solution of many important problems involves such a large amount 
of computation that with an ordinary desk calculator they are practically 
unsolvable. The advent of electronic computing machines, which perform 
computations with a rapidity previously unknown, has completely
revolutionized the application of mathematics to the most important 
problems of physics, mechanics, astronomy, chemistry, and so 
forth. 

A contemporary universal electronic computing machine performs 
thousands and even tens of thousands of arithmetic and logical operations 
in one second and takes the place of several hundred thousand human 
computers. Such rapidity of computation allows us, for example, to 
compute the trajectory of a flying missile more rapidly than the missile 
itself flies.

In addition to their great rapidity in performing arithmetic and logical 
operations, universal electronic computing machines enable us to solve 
the most diverse problems on one and the same machine. These machines 
represent a qualitatively new method which, besides an enormously 
increased production of standard results, makes it possible to solve 
problems previously considered quite inaccessible. 

In many cases the computations must be carried out with great rapidity
if the results are to have any value. This is particularly obvious in the
example of predicting the weather for the following day. With hand
calculators the computations involved in a reliable weather forecast for
the next day may themselves require several days, in which case they
naturally lose all practical value. The use of electronic computing
machines for this purpose makes it possible to secure the complete results
in plenty of time.

The high-speed electronic computing machine. The high-speed electronic
computing machine (BESM) which was constructed in the Institute
for Exact Mechanics and Computing Technology of the Academy of 
Sciences of the USSR is an example of such a machine. In one second 
the machine performs between 8,000 and 10,000 arithmetic operations. 
We scarcely need to remind the reader that on a desk calculator an
experienced operator can carry out only about 2,000 such operations in
one working day. Consequently, the electronic computer can perform in 
a few hours computations that the experienced operator could not perform 
in his whole lifetime. One such machine would replace a colossal army 
of tens of thousands of such operators. Merely to give them a place to 
stand would take up several hundred thousand square yards. 
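
As a rough check of this comparison, take the lower speed of 8,000 operations per second. The three-hour run and the figure of 300 working days per year are illustrative assumptions, not data from the text.

$$8{,}000 \times 3 \times 3{,}600 = 86{,}400{,}000 \ \text{operations}, \qquad \frac{86{,}400{,}000}{2{,}000} = 43{,}200 \ \text{working days} \approx 144 \ \text{years}.$$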

These electronic machines have been used to solve a large number of
problems from various domains of science and technology. As a result
economies have been achieved amounting to hundreds of millions of
dollars. We give several examples.

For the international astronomical calendar the orbits of approximately 
seven hundred asteroids were computed in the course of a few days,
account being taken of the influence on them of Jupiter and Saturn. 
Their coordinates were determined for ten years ahead and their exact 
positions were given for every forty days. Up till now such computations 
would have required many months of labor by a large computing
office. 

In making maps from the data provided by a geodetic survey of a 
given locality, it is necessary to solve a system of algebraic equations
with a large number of unknowns. Problems with 800 equations, requiring
up to 250 million arithmetic operations, were solved on the electronic 
machine in less than twenty hours. 

On the same machine tables were calculated to determine the steepest
possible slope for which the banks of a canal would not crumble, and
in this way large savings of time and material were effected in the
construction of hydroelectric power stations. In previous attempts fifteen
human computers had worked without success for several months
in an effort to solve this problem for only one special case. On the
electronic machine the computations for ten cases took less than three 
hours. 

On the machine one may rapidly test many different solutions for given
problems and choose the most appropriate. Thus one may determine,
for example, the most appropriate mechanical construction of a bridge, 
the best shape for the wing of an airplane, or for the nozzle of a jet motor, 
the blade of a turbine, and so forth. 

The practically infinite accuracy of the computations makes it possible
to construct very rapidly all kinds of tables for the needs of science and
technology. On the BESM the construction of a table containing 50,000
values of the Fresnel integral required only one hour.

Applications of electronic computing machines to problems of logic. 

In addition to handling mathematical problems, we may also solve 
logical problems on an electronic computing machine; for example, we 
may translate given texts from one language into another. In this case, 
instead of storing numbers in the machine, we store the words and numbers 
that take the place of a dictionary. 

Comparing the words in the text with the words in the “dictionary,” 
the machine finds the necessary words in the desired language. Then by 
means of grammatical and syntactical rules, which are described in the 
form of a program, the machine “processes” these words, changing them 
in case, number or tense, and setting them in the right order in a sentence. 
The translated text is printed on paper. For a successful translation a very
large amount of painstaking work on the part of philologists and
mathematicians is needed to set up the programs.

Experimental dictionaries and programs for the translation of a 
scientific-technical text from English into Russian were set up at the 
Academy of Sciences of the USSR, and at the end of 1955 the first
experimental translation was produced on the BESM machine, even though
this machine is not especially adapted for translation. 

By way of experiment complicated logical problems were successfully
solved on the BESM; for example, chess problems. A complete analysis
of chess is not possible on present-day electronic machines in view of
the enormous number of possible combinations. As an approximate
method the relative values of the various pieces are estimated; for example,
ten thousand points for the king, one hundred for the queen, fifty for a
rook. Various positional advantages are also estimated to be worth a
certain number of points; e.g., open files, passed pawns, and so forth.
By a series of trials the machine chooses the course of action that after
a specified number of moves produces the greatest number of points
for all possible answers on the part of the opponent. However, in view
of the enormous number of possible combinations the solution is
necessarily restricted to trying a comparatively small number of moves, which
excludes the study of strategic plans of play.
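
The selection rule just described can be sketched on an abstract tree of positions. This is a minimal illustration and not the BESM program: the class `Position`, the functions `guaranteed_points` and `choose_move`, and the tiny example tree are all hypothetical. Only the point values for the king, queen, and rook come from the text.

```python
from dataclasses import dataclass, field

# Point values quoted in the text.
PIECE_POINTS = {"K": 10_000, "Q": 100, "R": 50}

@dataclass
class Position:
    points: int                                   # estimate for the machine's side
    replies: list = field(default_factory=list)   # positions one move later

def guaranteed_points(pos, depth, machine_to_move):
    """Points the machine can secure within `depth` moves against any answers."""
    if depth == 0 or not pos.replies:
        return pos.points
    values = [guaranteed_points(p, depth - 1, not machine_to_move)
              for p in pos.replies]
    # The machine picks its best move; the opponent picks the worst for it.
    return max(values) if machine_to_move else min(values)

def choose_move(pos, depth):
    """Index of the move whose worst-case outcome after `depth` moves is best."""
    return max(range(len(pos.replies)),
               key=lambda i: guaranteed_points(pos.replies[i], depth - 1, False))

rook = PIECE_POINTS["R"]
root = Position(0, [
    # Move 0 wins a rook at once, but one answer then costs the machine a queen.
    Position(rook, [Position(rook - PIECE_POINTS["Q"]), Position(rook)]),
    # Move 1 is quiet, and no answer to it loses material.
    Position(0, [Position(0), Position(5)]),
])
```

With a horizon of one move the rook capture looks best; with a horizon of two moves the quiet move is chosen, which shows how strongly the result depends on the number of moves examined.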
Basic principles of the operation of electronic computing machines.

A present-day electronic computing machine consists of a complicated
complex of elements of electronic automation: electron tubes, germanium
crystal elements, magnetic elements, photoelements, resistors, condensers,
and other elements of radio technology.

Arithmetic operations are performed with colossal rapidity by electronic
computing devices, which are assembled in the arithmetic unit (figure 1).

But to guarantee high speed for computations it is not enough just to
perform rapid arithmetic operations on numbers. In the machine the whole
computational process must be completely automatic. Access to the required
numbers and establishment of a specific sequence of operations on them are
set up automatically.

The numbers on which the operations are to be performed and also 
the results of intermediate calculations must be stored in the machine. 
An entire mechanism, the so-called “memory unit,” is designed for this
purpose; it allows access to any required number and also stores the
result of the computation. The capacity of the memory unit, i.e., the
number of numbers that may be stored in it, to a great extent determines
the flexibility of the machine for the solution of various problems.

In present-day electronic machines the capacity of the memory unit 
is from 1,000 to 4,000 numbers. 

The extraction of the required numbers from the memory unit, the 
operation that must be performed on these numbers, the storing of the 
result in the memory unit and the passage to the next operation are all
guided in the electronic computing machine by a control unit. After 
the computing program and the initial data are introduced into the 
machine the control unit guarantees the fully automatic character of the 
computational process. 

To introduce the initial data and the computational program into the 
machine, and also to print the results on paper, is the purpose of special
input and output units. 

Fig. 1. Diagram of the basic units of an electronic digital computer.

When we are using the machine for making computations, we must
have confidence in the correctness of the results produced; i.e., we must
have some means of checking them. Verification of the correctness of
the computations is effected either by means of special verification
mechanisms or by the usual methods of logical or mathematical verification
embodied in a special program. The simplest example of such a verification
is the “duplication check” (the so-called “calculation on both hands”),
which consists of computing twice and collating the results.
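
In program form the duplication check might look as follows. This is a sketch under assumptions: the helper name `duplication_check` and the choice to raise an error on disagreement are hypothetical, and a real machine would more likely halt or repeat the step.

```python
def duplication_check(compute, *args):
    """Run the same computation twice and accept it only if both runs agree."""
    first = compute(*args)
    second = compute(*args)
    if first != second:
        raise RuntimeError(f"runs disagree: {first!r} != {second!r}")
    return first

# A correct computation passes the check and its result is returned.
checked_total = duplication_check(sum, [1, 2, 3])
```

The check detects random faults of the machine, but not an error in the program itself, since a wrong program gives the same wrong answer both times.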

Before proceeding to the solution of a particular problem, we must
first of all, on the basis of the physical process under investigation, state
the problem in terms of algebraic formulas, or of differential or integral
equations, or other mathematical relations. Then by applying
well-developed methods of numerical analysis, we can almost always reduce
the solution of such a problem to a specific sequence of arithmetic
operations. In this way the most complicated problems are solved by 
means of the four operations of arithmetic. 

To perform any arithmetic operation by hand computation it is
necessary to take two numbers, perform the given arithmetic operation on
them, and write down the result produced. This result may be necessary
for further computations or may itself be the desired answer. 

The same operations are also carried out in electronic computing 
machines. The memory unit of the machine consists of a series of locations
or cells. The locations are all enumerated in order, and to select a number
for calculation, we must give the location in which it “is stored.” 

To perform any one arithmetic operation on two numbers, we must 
give the locations in the memory unit from which the two numbers are 
to be taken, the operation to be performed on them, and the location 
in which the result is to be placed in the memory. Such information,
presented in a specific code, is called an “instruction.”

The solution of a problem consists of performing a sequence of 
instructions. These instructions constitute the program for the computation 
and usually they are also stored in the memory unit. 

A computing program, i.e., a set of instructions effecting the sequence 
of arithmetic operations necessary for the solution of the problem, is 
prepared by mathematicians in advance. 

Many problems require for their solution several hundred million 
arithmetic operations. So in electronic machines we use methods which 
allow a comparatively small number of initial instructions to govern a 
large number of arithmetic operations. 

Together with the instructions governing arithmetic operations,
electronic computers also provide for instructions governing logical
operations; such a logical operation may consist, for example, of the comparison
of two numbers with the purpose of choosing one of two possible further
courses for the computation, depending on which of the two numbers
is the larger.

The instructions of a program and also the initial data are written in 
terms of a prearranged code. Usually the description of the instruction 
is recorded on perforated cards or tape in the form of punched holes or 
else on magnetic tape in the form of pulses. Then these codes are
introduced into the machine and placed in the memory unit, after which
the machine automatically carries out the given program. 

The results of the computation are again recorded, for example in the
form of pulses on a magnetic tape. Special decoding and printing units
translate the magnetic tape code into ordinary digits and print them in
the form of a table. 

The speed with which computers perform the most complicated 
calculations has produced a saving of mental labor which can only be 
compared with the saving in physical labor made possible by modern
machinery. Of course, an electronic machine only carries out a program
set up by its operator; it does not itself have any creative possibilities
and cannot be expected to replace a human being. 

The wide use of electronic computing machines in institutes of science 
and technology, in construction offices, and in planning organizations 
has opened up limitless possibilities in the solution of problems in the 
national economy. Engineers and mathematicians have before them
rewarding prospects for further development in the operation and
construction of computing machines and also in their application and
exploitation. 

Electronic computing machines are powerful tools in human hands. 
The significance of these machines for the national economy can hardly be 
overestimated. 


§2. Programming and Coding for High-Speed Electronic Machines 

The basic principles of programming; 1. Euler’s method for differential
equations. For computations on electronic machines the mathematical
method selected for approximating the solution of a problem necessarily
consists of a sequence of arithmetic operations. The execution of these
operations by the machine is guaranteed by the program, which, as we
have said, consists of a sequence of instructions. Of course, if we were
required to give a separate instruction for each one of the arithmetical 
operations, the program would be very lengthy and even to describe it 
would take about as much time as performing the operations themselves 
by hand. Thus in programming we must try to make a small number of 
instructions suffice for a large number of arithmetic operations. 
To clarify the structure of a sequence of instructions and the methods
of setting up a program, let us first examine the operations that must be 
performed when a very simple problem is solved by hand. 

We will take as an example the solution by Euler’s method of the
following differential equation of the first order with the given initial
conditions

$$\frac{dy}{dx} = ay, \qquad y\big|_{x = x_0} = y_0. \tag{1}$$

In this method the range of values of $x$ is divided up into a sequence
of intervals of equal length $\Delta x = h$, and within each interval the
derivative $dy/dx$ is regarded as a constant, equal to its value at the
beginning of the interval.* With these assumptions the computation for the
$k$th interval is given by the formulas

$$\Delta y_k = (ah)\,y_k,$$

$$y_{k+1} = y_k + \Delta y_k,$$

$$x_{k+1} = x_k + h.$$

After carrying out the calculation for the $k$th interval, we go on to the
$(k + 1)$th interval. The computation begins with the given initial values
$x_0$ and $y_0$. The sequence of operations is shown in Table 1.
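
Written as a short program, the three formulas and the repetition over the intervals might look like this. It is a minimal sketch: the function name `euler_table` and the numerical values in the example are assumptions chosen for illustration. As on the machine, each pass performs the computation first and only afterwards tests whether the final value of $x$ has been reached.

```python
def euler_table(a, h, x0, y0, x_n):
    """Tabulate the Euler approximation of dy/dx = a*y from x0 up to x_n."""
    x, y = x0, y0
    table = [(x, y)]
    while True:
        dy = (a * h) * y        # increment of y on the k-th interval
        y = y + dy              # y_{k+1} = y_k + dy
        x = x + h               # x_{k+1} = x_k + h
        table.append((x, y))
        if not x < x_n:         # stop once x has reached the final value
            return table

# Illustrative values: a = 1, step h = 0.5, starting at (0, 1), ending at x = 2.
rows = euler_table(a=1.0, h=0.5, x0=0.0, y0=1.0, x_n=2.0)
```

With these values each step multiplies $y$ by $1 + ah = 1.5$, so the last row is $(2.0, 5.0625)$, a rather crude approximation to $e^2$ that improves as $h$ is reduced.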

In hand computations only the first three operations are performed,
the others being understood but not written down; this is true, for example,
of the instruction to begin over again for the following interval, to end
the computation, and so forth. In machine computation all these
operations must be exactly formulated (operations 4-7). Consequently,
in the machine, in addition to the arithmetic operations, we must also
arrange in advance for the control operations (operations 4-7). The control
operations have either a completely definite character (for example,
operations 4 and 5) or a conditional character, which depends on the
result just produced (for example, operations 6 and 7). Since the last two
operations are mutually exclusive (we must perform either one or the
other of them), these two operations are combined in the machine into
one (a comparison operation), which is formulated in the following way:
“If $x$ is less than $x_n$, repeat the operations beginning with number 1;
but if $x$ is equal to or greater than $x_n$, stop the computation.” In this
way, the sequence of further computations depends on the magnitude of
the $x$ already produced in the process of computing.


* In practice the solution of an ordinary differential equation is usually
calculated by a more complicated and exact formula.


Table 1. Operations Necessary for the Solution of Equation (1) by
Euler’s Method

Number of       Quantity
the Operation   Defined        Formula              Computations*

1               $\Delta y_k$   $(ah)\,y_k$          $(ah)\cdot(2)_{k-1}$
2               $y_{k+1}$      $y_k + \Delta y_k$   $(2)_{k-1} + (1)_k$
3               $x_{k+1}$      $x_k + h$            $(3)_{k-1} + h$
4                              Print the value found for $x_{k+1}$.
5                              Print the value found for $y_{k+1}$.
6                              Repeat the computation, beginning with
                               operation no. 1 for the new values of
                               $x$ and $y$.
7                              When $x$ reaches the value $x_n$, stop
                               the computation.



2. The three-address system. A glance at Table 1 shows that to
perform any arithmetic operation it is necessary to indicate: first, which
operation (addition, multiplication, etc.) is to be performed; second,
which numbers it is to be performed on; and third, where to put the
result, since it is to be used in further computation.

The code expressions for the numbers are stored in the memory unit 
of the machine; consequently the indexes of the corresponding locations 
in the memory must be given: namely, where the numbers are to be 
taken from and where the resuit is to be placed. This leads to the most 
natural “three-address system of instructions.”

* The digits (with subscripts) in parentheses in the column “Computations”
indicate the operation whose result is to be used in the computation. For
example, in the first operation (the first row) we have to multiply the
quantity $(ah)$ by the quantity found as a result of performing the second
operation (the second row) for the preceding interval, $(2)_{k-1}$; in the
second operation we have to add the quantity resulting from the operation
for the preceding interval, $(2)_{k-1}$, to the quantity resulting from the
first operation for the present interval, $(1)_k$.

At the beginning of the computation the initial data $x_0$ and $y_0$ are
placed in the column “Quantity Defined” for operations 2 and 3.

In the three-address system, a specific set of locations in the code is
assigned to defining the operations; i.e., to stating which operation is
to be performed on the given two numbers (the code of operations).
The remaining locations in the instruction code are divided into three
equal groups, called “instruction addresses” (figure 2). The code in the
first address shows the index of the location in the memory unit from
which the first number is to be taken, the second address code is the
index of the location from which the second number is to be taken, and
the third address code is the index of the location of the memory unit
in which the result is to be placed.


Code of the Operation | 1st Address | 2nd Address | 3rd Address

Fig. 2. The structure of a three-address system of instructions.


Code expressions for instructions referring to the control unit may also 
be put into the three-address System. Thus, the instruction “transfer a 
number to the print-out unit” must be represented in the code of operations 
by the number assigned to this operation; in the first address will appear 
the index of the location in the memory unit where the number to be 
printed is stored and in the third address the index of the printing unit 
(in the second address the code is blank). An instruction that either one 
course or another is to be followed is called a “comparison instruction.” 
The code of operations of such an instruction states that it is necessary
to compare two numbers, namely the ones indicated in the first and 
second addresses of the instruction. If the first number is smaller than 
the second, we must pass to the instruction indicated in the third address 
of the comparison command. But if the first number is greater than or 
equal to the second, then the given instruction consists simply of the 
command to pass to the next instruction. 

Instruction codes, as well as number codes, are stored in the memory 
unit and follow one after the other in the order in which they are
numbered provided there is no change indicated in the course of the
computations (for example, by a comparison operation). 

Let us consider how the program will look in the previous example. 
We set up the following distribution of number codes in the locations 
of the memory unit: 

The quantity $ah$ is in the 11th location
The quantity $h$ is in the 12th location
The quantity $x_n$ is in the 13th location
The quantity $x$ is in the 14th location
The quantity $y$ is in the 15th location
The operative location* is the 16th.

* A location in which intermediate values found in the course of the computation 
are placed is called operative. 
Corresponding to the preceding table we get the following program
(Table 2).


Table 2. Program for the Solution of Equation (1) by Euler’s Method

Number                          Instruction Code
of the        Code of the     1st      2nd      3rd
Instruction   Operation       Address  Address  Address  Remarks

1             Multiplication  11       15       16       $\Delta y_k = (ah)\,y_k$
2             Addition        15       16       15       $y_{k+1} = y_k + \Delta y_k$
3             Addition        14       12       14       $x_{k+1} = x_k + h$
4             Print           14       —        1        Print $x_{k+1}$ in the first
                                                         printing unit
5             Print           15       —        2        Print $y_{k+1}$ in the second
                                                         printing unit
6             Compare         14       13       1        If $x < x_n$, return to
                                                         instruction no. 1; if
                                                         $x \geq x_n$, pass to the
                                                         following instruction,
                                                         i.e., to instruction no. 7.
7             Stop            —        —        —        End of the computation.


The instruction code is placed in the memory unit (in Table 2, in the
1st through 7th locations). In the control unit we then place the instruction
found in the first location of the memory unit. In obedience to this
instruction the number in the 11th location is multiplied by the number
in the 15th; i.e., the quantity $\Delta y_k = (ah)\,y_k$ is computed. The
result is placed in the operative 16th location. With the completion of this
operation the instruction from the next location of the memory unit, i.e.,
from the second location, enters the control unit. By this instruction the
quantity $y_{k+1} = y_k + \Delta y_k$ is found, and is placed in the 15th
memory location; i.e., it replaces the previous value of $y$. Similarly, by the
third instruction the new value of $x$ is found; the 4th and 5th instructions
cause the printing of the newly found values of $x$ and $y$; the 6th
instruction defines the further course of the computational process. This
instruction produces a comparison of the number found in the 14th memory
location with the number in the 13th location, i.e., a comparison of the value
$x_{k+1}$ which has been produced with the final value $x_n$. If
$x_{k+1} < x_n$, the computation must be repeated for the next interval; i.e.,
in the given example we must return to the first instruction. The index of this
instruction, to which we must pass if the first number is less than the second,
is shown in the third address of the comparison instruction. But if the
computation has produced a value $x_{k+1} \geq x_n$, the comparison instruction
causes passage to the next instruction, i.e., to the 7th, which stops the
computing process.

Before beginning the computation, we must introduce in the memory 
unit the instruction codes (in locations 1-7), the code expressions for the 
constants (locations 11-13) and also the initial data, i.e., the values
$x_0$ and $y_0$ (in locations 14 and 15).

After completion of the computation for the first interval, the 14th 
and 15th memory locations will contain, in place of $x_0$ and $y_0$, the
quantities $x_1$ and $y_1$, i.e., the values of the variables for the beginning
of the next interval. In this manner, the computations for the next interval
will be produced by repetition of the same instruction program.

The example considered shows that, by carrying out a cyclical repetition
of a series of instructions, we may carry out a large amount of computation
with a comparatively small program. The method of cyclical repetition
of separate parts of a program is widely used in programming the solution 
of problems. 
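
The action of the control unit on the program of Table 2 can be imitated by a short interpreter. The following is only a sketch under assumptions: the operation names, the representation of an instruction as a tuple, and the functions `run` and `euler_program` are hypothetical and do not reproduce the actual codes of any machine. The memory locations are those used in the text.

```python
MUL, ADD, PRINT, CMP, STOP = "mul", "add", "print", "cmp", "stop"

def run(memory, start=1):
    """Execute three-address instructions (op, a1, a2, a3) held in memory."""
    printed = {}                      # printing unit number -> printed values
    pc = start                        # index of the next instruction
    while True:
        op, a1, a2, a3 = memory[pc]
        pc += 1                       # by default pass to the next instruction
        if op == MUL:
            memory[a3] = memory[a1] * memory[a2]
        elif op == ADD:
            memory[a3] = memory[a1] + memory[a2]
        elif op == PRINT:
            printed.setdefault(a3, []).append(memory[a1])
        elif op == CMP:
            if memory[a1] < memory[a2]:
                pc = a3               # jump to the instruction in the 3rd address
        elif op == STOP:
            return printed

def euler_program(a, h, x0, y0, x_n):
    """Memory contents for Table 2: instructions in 1-7, numbers in 11-16."""
    return {
        1: (MUL, 11, 15, 16),
        2: (ADD, 15, 16, 15),
        3: (ADD, 14, 12, 14),
        4: (PRINT, 14, None, 1),
        5: (PRINT, 15, None, 2),
        6: (CMP, 14, 13, 1),
        7: (STOP, None, None, None),
        11: a * h, 12: h, 13: x_n, 14: x0, 15: y0, 16: 0.0,
    }

# Illustrative data: a = 1, h = 0.5, x from 0 to 2, y(0) = 1.
output = run(euler_program(a=1.0, h=0.5, x0=0.0, y0=1.0, x_n=2.0))
```

Seven stored instructions are executed twenty-five times in this small case; a smaller step $h$ increases the amount of computation without lengthening the program.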


3. Change of address of instructions. A second widely used method
that allows one to make essential reductions in the size of a program
consists of automatically changing the addresses of certain instructions.
To explain the essence of this method, we take the example of computation
of the values of a polynomial.

Let it be required to compute the value of the polynomial

$$y = a_0 x^6 + a_1 x^5 + a_2 x^4 + a_3 x^3 + a_4 x^2 + a_5 x + a_6.$$

For machine computation this polynomial is more conveniently represented
in the form

$$y = (((((a_0 x + a_1)x + a_2)x + a_3)x + a_4)x + a_5)x + a_6.$$

Let the values of the coefficients $a_0, \cdots, a_6$ be placed in memory
locations 20-26, and the value of $x$ in the 31st location of the memory unit.
The program is very easy to construct and is given in Table 3.

As can be seen, in this program the operations of multiplication and
addition occur alternately. All the multiplication instructions, with the
exception of the 1st, are completely alike: we have to multiply the number
found in the 27th location by the number found in the 31st and put the
result in the 27th. All the addition instructions have the same 1st and
3rd address. But the index of the location in the second address, in
changing from one instruction of addition to the next, is increased each time
by one: in the second instruction the number is found in the 21st location,
in the fourth instruction in the 22nd, and so forth.

The computing program may be essentially shortened, if we arrange 
for an automatic change in the indexes (giving the memory location) in 
the second address of the addition instruction. The instruction codes are 
stored in the corresponding locations and they may themselves be
considered as certain numbers. By the addition of suitable numbers to them,
we can make an automatic change in the instruction addresses. In such
a method the program for computing the values of a polynomial will
have the form given in Table 4.


Table 4. Program for Computing a Polynomial

Number of                       Instruction Code
the            Code of the     1st      2nd      3rd
Instruction    Operation       Address  Address  Address

1              Addition        20       —        27
2              Multiplication  27       31       27
3              Addition        27       21       27
4              Addition        3        28       3
5              Comparison      3        29       2
6              Stop            —        —        —


The first instruction serves to transfer the number from the 20th location
to the 27th in order to have the multiplication instruction in standard
form. In performing the 2nd and 3rd instructions, we get the value of
$a_0 x + a_1$. For further computation it is necessary as a preliminary to
change by 1 the second address in the addition instruction (the 3rd
instruction), and this change is made by the 4th instruction. According
to this instruction we take the number found in the 3rd location, i.e.,
the addition instruction in question (the 3rd instruction) and add to it
the quantity found in the 28th location. In order to change by 1 the
2nd address of the 3rd instruction, the 28th memory location must
contain the following:

Code of the    1st      2nd      3rd
Operation      Address  Address  Address

—              —        1        —

After performing the instruction in this way, we have put the 3rd
instruction into the following form:

Code of the    1st      2nd      3rd
Operation      Address  Address  Address

Addition       27       22       27

This new form is stored in the 3rd memory location in place of the previous 
form of the addition instruction. 

Having obtained this new form by the addition instruction, we may 
repeat the computations, beginning with the multiplication instruction, 
i.e., with the 2nd instruction. The 5th instruction, a comparison, serves
for this purpose.
This instruction compares the newly found instruction in the 3rd location 
with the quantity stored in the 29th location. In the 29th location is 
stored the following: 

Code of the    1st      2nd      3rd
Operation      Address  Address  Address

Addition       27       27       27

This comparison initially tells us that the first quantity (in the third 
location) is less than the second (in the 29th location), and so the process 
of computation passes to the 2nd instruction, shown in the 3rd address 
of the comparison instruction. Thus the multiplication instruction (the 
2nd instruction) and the addition instruction (the 3rd instruction) will be 
automatically repeated, and each time the number of the location in the 
2nd address of the addition instruction will be changed by one (as arranged 
for by the 4th instruction). 

Repetition of the cycle will continue until the 2nd address of the
addition instruction (the 3rd instruction) reaches the magnitude 27,
which happens after six repetitions of the cycle. Here the 3rd instruction
will have the form:

Code of the    1st      2nd      3rd
Operation      Address  Address  Address

Addition       27       27       27

i.e., the instruction code will be the same as in the 29th location. The 
comparison instruction (the 5th instruction) takes note at this stage of 
the equality of the quantities found in the 3rd and 29th location, so that 
the process of computation passes to the next instruction, i.e., the 6th, 
and herewith the computation of the polynomial is finished. 
The method of automatically changing, as part of the program itself,
the number of the location in the addresses of certain instructions is
widely applied for the solution of many different problems. Together with
the method of cyclic repetitions, it enables us to perform a very large
volume of computations with a small number of instructions. 
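
The cycle with automatic address modification can be imitated by a short
simulation. What follows is only a sketch under stated assumptions: the
operation codes, the packing of an instruction into one integer (two
decimal digits per address), the contents of the 1st instruction, and the
placing of x, the coefficients and the constants in locations 18 to 29 are
all hypothetical choices made for illustration. Only the roles of
instructions 2 to 6 and of locations 3, 27 and 29 follow the description
above.

```python
# Hypothetical operation codes and word layout (assumptions, not from the text).
ADD, MUL, CMP, STOP = 1, 2, 3, 4


def encode(op, a1, a2, a3):
    """Pack an instruction into one integer, two decimal digits per address."""
    return ((op * 100 + a1) * 100 + a2) * 100 + a3


def decode(word):
    """Unpack an instruction word into (op, a1, a2, a3)."""
    return word // 10**6, word // 10**4 % 100, word // 100 % 100, word % 100


def run_polynomial(x, coeffs):
    """Evaluate a0*x**6 + ... + a6 by repeated multiply-and-add.

    coeffs = [a0, a1, ..., a6]. Returns (value, final memory contents).
    """
    mem = {
        1: encode(ADD, 19, 18, 27),   # assumed start: running value := a0 + 0
        2: encode(MUL, 27, 20, 27),   # running value := running value * x
        3: encode(ADD, 27, 21, 27),   # running value += next coefficient
        4: encode(ADD, 3, 28, 3),     # add 1 to the 2nd address of instruction 3
        5: encode(CMP, 3, 29, 2),     # if instruction 3 < final form, go to 2
        6: encode(STOP, 0, 0, 0),
        18: 0, 19: coeffs[0], 20: x,
        28: 100,                      # one unit in the 2nd-address field
        29: encode(ADD, 27, 27, 27),  # final form of instruction 3
    }
    for loc, a in zip(range(21, 27), coeffs[1:]):
        mem[loc] = a                  # a1..a6 in locations 21..26

    pc = 1
    while True:
        op, a1, a2, a3 = decode(mem[pc])
        if op == STOP:
            return mem[27], mem
        if op == CMP:
            # Jump to the 3rd address if first < second, else go on in order.
            pc = a3 if mem[a1] < mem[a2] else pc + 1
            continue
        mem[a3] = mem[a1] + mem[a2] if op == ADD else mem[a1] * mem[a2]
        pc += 1


value, memory = run_polynomial(2, [1, 2, 3, 4, 5, 6, 7])
print(value, decode(memory[3]))
```

Because instructions and numbers share one memory, the 4th instruction is
an ordinary addition whose result happens to be a new instruction, and the
comparison treats the instruction in location 3 as a plain quantity.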

4. The one-address system. In addition to the three-address system of
instructions that we have considered, in many machines a one-address
system is used. In a one-address system each instruction contains, in
addition to the code of operation, only one address. Performing an
arithmetic operation with two numbers and placing the result in the
memory unit calls for three instructions: the first instruction puts one
of the numbers from the memory unit into the arithmetic unit, the second
puts in the second number and performs the given operation with the
numbers, the third places the result in the memory unit. In the course
of any computation, the result produced is often used only to perform
the next following arithmetic operation. In these cases one does not
need to put the result obtained into the memory unit, and for the
performance of the following operation one does not need to recall the
first number. Thus the number of instructions in a program with a
one-address system is found to be roughly only twice as large as for a
three-address system. Since a one-address instruction takes up less space
than a three-address instruction, the amount of space taken up in the
memory unit by the program will be about the same for both systems of
instructions (usually in a one-address system of instructions each
location of the memory unit will contain two instructions). The
differences in the two systems of instructions must be taken into account
in making a comparison of the rapidity of working of the machines. For
the same rapidity of performing an operation, a one-address machine will
perform computations about twice as slowly as a three-address machine.
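
As a rough illustration of the count of instructions, the sketch below
runs a one-address program for d = (a + b) * c on a machine with a single
accumulator. The operation names LOAD, ADD, MUL, STORE and the choice of
locations 0 to 3 are assumptions made for the example; the three-address
program is listed only for comparison of length.

```python
def run_one_address(program, mem):
    """Execute (operation, address) pairs using a single accumulator."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":        # memory unit -> arithmetic unit
            acc = mem[addr]
        elif op == "ADD":       # fetch second number and add
            acc = acc + mem[addr]
        elif op == "MUL":       # fetch second number and multiply
            acc = acc * mem[addr]
        elif op == "STORE":     # arithmetic unit -> memory unit
            mem[addr] = acc
        else:
            raise ValueError(f"unknown operation {op!r}")
    return mem


# d = (a + b) * c, with a, b, c, d in locations 0, 1, 2, 3.
one_address = [("LOAD", 0), ("ADD", 1), ("MUL", 2), ("STORE", 3)]
three_address = [("ADD", 0, 1, 3), ("MUL", 3, 2, 3)]

memory = run_one_address(one_address, {0: 2, 1: 3, 2: 4, 3: 0})
print(memory[3], len(one_address), len(three_address))
```

The intermediate sum never leaves the accumulator, which is why four
one-address instructions suffice here instead of six.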

In addition to these systems, certain machines have a two-address or a
four-address system of instructions.

5. Subroutines. Usually the solution of a problem is carried out in
several stages. Many of these stages are common to a series of problems.
Examples of such stages are: computing the value of an elementary
function for a given argument, or determining the definite integral of a
function already computed.

Naturally it is desirable for such typical stages to have standard
subroutines worked out once and for all. If in the course of the solution
of a problem we are required to carry out standard computations, we
should transfer the computation at the appropriate moment to one of
the standard subroutines. Then at the end of the computations involved
in the subroutine, it is necessary to return to the basic program at the
place where it was interrupted.

The existence of standard programs makes the task of the programmer 
considerably easier. With a library of such subroutines, recorded either 
on punched cards or on magnetic tape, the programming of many 
problems consists simply of setting up some short parts of the basic 
program linking together a sequence of standard subroutines. 

6. Verification of results. On electronic computing machines problems
are solved that require several million arithmetic operations. An error in
even one of the operations may lead to incorrect results. Of course, it is
practically impossible to set up a check system by hand over such a large
number of computations. Thus the checks and verifications must be
carried out by the machine itself. Apparatus exists that will verify the
correctness of the machine's operations and bring it to an automatic
stop if an error is discovered. However, this apparatus involves a
considerable increase in the size and complexity of the machine and usually
does not act on all its parts. More promising are the methods of
verification that are included in advance in the program itself.

One such method of verification consists simply of repetition of the
computation, as is so common in hand computation under the name of
“duplication check.” If an independent repetition of the computation
produces the same results, we may be sure that there are no random
errors, but this method will naturally fail to reveal the presence of
systematic errors. To exclude the latter we must carry out in advance some
control computations with previously known answers, and these
computations must involve all parts of the machine. Correctness of the
results produced in the control computations serves to guarantee the
absence of systematic errors.

In addition to this “duplication check,” we may apply more complicated
methods of verification, depending on the type of problem. For example,
in calculating the trajectory of a projectile, we may first solve the system
of differential equations for the two components of the velocity and then
subsequently solve the single differential equation for the total velocity
and at each step of the integration verify the formula:

$v^2 = v_x^2 + v_y^2.$

For the solution of ordinary differential equations, in addition to the
computation with steps of integration h, we may carry out a second
computation with steps h/2. This will not only guarantee the absence of
random errors in the computation but also will give an estimate of the
validity of the choice of step size. In computing a table by a recurrence
formula, we may sometimes compute certain key values by other methods.
A correct result for the key values is a sufficient guarantee of the
correctness of all intermediate values. In some cases verification may
consist of noting the differences between the results produced.
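
The check with steps h and h/2 may be sketched as follows. Euler's method
is used here only because it is the shortest to write down; the text does
not say which method of integration is meant, and the function names and
the test equation y' = y are assumptions made for the example.

```python
def euler(f, t0, y0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with a fixed number of steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y


def halving_check(f, t0, y0, t1, steps):
    """Run the computation with step h and again with step h/2.

    Returns the finer result and the discrepancy between the two runs.
    A large discrepancy points either to an error or to too coarse a step.
    """
    coarse = euler(f, t0, y0, t1, steps)
    fine = euler(f, t0, y0, t1, 2 * steps)
    return fine, abs(fine - coarse)


# y' = y, y(0) = 1, integrated to t = 1 (exact answer: e = 2.71828...).
result, discrepancy = halving_check(lambda t, y: y, 0.0, 1.0, 1.0, 100)
print(result, discrepancy)
```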

In constructing a program it is necessary to provide in advance for
some form of logical verification of the results obtained.


Coding of numbers and instructions. Numbers and instructions are
placed in machines in the form of codes. In most cases the binary system
of notation is used instead of the ordinary decimal system.

In the decimal system the number 10 is taken as the base. The digits
in each position may take one of the ten values from 0 through 9. The
unit in each successive position is ten times as large as the unit in the
preceding position. Consequently, an integer in the decimal system may be
written

$N_{10} = k_0 10^0 + k_1 10^1 + k_2 10^2 + \cdots + k_n 10^n,$

where $k_0, k_1, \ldots, k_n$ may take the values from 0 through 9.

In the binary system the number 2 is taken as the base. The digits in
each position may take only the two values 0 and 1. A unit in each
successive position is twice as large as a unit in the preceding position.
Consequently, an integer in the binary system may be written

$N_2 = k_0 2^0 + k_1 2^1 + \cdots + k_p 2^p,$

where $k_0, \ldots, k_p$ may take the values 0 or 1.

The first few natural numbers in the binary and the decimal system
are written:

Binary system      0     1    10    11   100   101   110   111  1000  1001  1010  1011  etc.
Decimal system     0     1     2     3     4     5     6     7     8     9    10    11  etc.

A noninteger is written analogously in terms of negative powers of the
base. For example, $3\tfrac{1}{8}$ is written in the binary system as

11.001.
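
The transfer from the decimal to the binary system can be sketched in a
few lines: the integer part is divided repeatedly by 2, the fractional
part is multiplied repeatedly by 2. The function name and the limit of
eight fractional digits are assumptions made for the example.

```python
from fractions import Fraction


def to_binary(value, frac_digits=8):
    """Write a nonnegative number in the binary system as a string."""
    value = Fraction(value)
    if value < 0:
        raise ValueError("only nonnegative numbers are handled in this sketch")
    whole, frac = divmod(value, 1)
    whole = int(whole)

    # Integer part: successive remainders on division by 2, lowest digit first.
    digits = ""
    while whole:
        digits = str(whole % 2) + digits
        whole //= 2
    digits = digits or "0"

    # Fractional part: successive integer parts on multiplication by 2.
    frac_part = ""
    while frac and len(frac_part) < frac_digits:
        bit, frac = divmod(frac * 2, 1)
        frac_part += str(int(bit))

    return digits + ("." + frac_part if frac_part else "")


print([to_binary(n) for n in range(12)])
print(to_binary(Fraction(25, 8)))   # 3 1/8
```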


The transfer of numbers from one system of notation to another
involves specific arithmetic operations that are usually carried out in the
electronic computing machine itself by special programs.

Arithmetic operations on numbers in the binary system are carried out
in exactly the same way as in the decimal system. Here the addition of
two units in any position produces zero in the given position and carries
one to the following position. For example,

1010 + 111 = 10001.

Multiplication and division in the binary system are simpler than in the
decimal system, since the multiplication table is replaced by the rules
for multiplying by 0 and 1. For example,

      1010   (10)
    x  101   (5)
    ------
      1010
     0000
    1010
    ------
    110010   (50)
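
Both worked examples can be reproduced digit by digit. The sketch below
operates on strings of binary digits so that the carries and the shifted
partial products remain visible; the function names are assumptions made
for the example.

```python
def add_binary(a, b):
    """Add two binary numerals given as strings, carrying from right to left."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, out = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        out.append(str(total % 2))   # digit left in the given position
        carry = total // 2           # unit carried to the following position
    if carry:
        out.append("1")
    return "".join(reversed(out))


def multiply_binary(a, b):
    """Multiply by summing the multiplicand shifted under each 1 of the multiplier."""
    product = "0"
    for shift, digit in enumerate(reversed(b)):
        if digit == "1":             # multiplying by 0 contributes nothing
            product = add_binary(product, a + "0" * shift)
    return product


print(add_binary("1010", "111"))
print(multiply_binary("1010", "101"))
```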

The choice of the binary system of notation in the majority of electronic
computing machines is because the arithmetic unit is thereby greatly
simplified (generally at the expense of brevity in the operations of
multiplication and division) and also the digits in each position are
conveniently represented, for example, by open or closed relays, the
presence or absence of a signal in a circuit, and so forth (in the binary
system the digits in each position can only have the two values: 0 or 1).

Every digit of a binary number may be represented in the form of the
presence or absence of a signal in its circuit, or in the state of a relay.
In this case it is necessary that every digit have its own circuit or relay
(figure 3) and the number of such circuits will be equal to the number
of digits (parallel system). A binary number may also be represented in
the form of a time-pulse code. In this case each digit of a number is
represented at specific intervals of time on one circuit (series system).
The time intervals for each digit are created by synchronizing pulses,
common to the entire machine.

Corresponding to these two principles, the methods of coding a number
for an electronic computing machine fall into two categories: one for a
machine with parallel operation and the other for a machine with series
operation. In a machine with parallel operation all the digits of a number
are transmitted at the same time and each digit requires its own circuit.
In a machine with series operation the number is transmitted by one
circuit, but the time of the transmission is proportional to the number
of digits. Thus machines with parallel operation are faster than machines
with series operation, but they also require more apparatus.

Fig. 3. Code systems: (a) parallel; (b) series (1 is the code; 2 is the
synchronizing pulse). The number shown is 10010111.

Every electronic computing machine has a specific number of places for
digits. All numbers to be dealt with in a computation must be included in
that number of places, and the position of the decimal point, separating
the integer part from the fractional, must naturally be included.

In certain machines the position of the decimal point is rigidly fixed;
these are the so-called “fixed-point” machines.

Usually the decimal point is put before the first place; i.e., all the
numbers for the computation must be less than one, which is guaranteed by
the choice of a suitable scale. For complicated computations it is
difficult to determine in advance the range of the results to be expected,
and thus we have to choose the scale so as to have something in reserve, a
procedure which lowers the accuracy, or else we must arrange in the
program itself for an automatic change of scale, which complicates the
programming.

In certain machines the position of the decimal point is indicated for
each number; these are machines which keep track of the exponents and
they are usually called “floating-point” machines. Indicating the position
of the decimal point is equivalent to representing the number in the form
of its sequence of digits and its exponent, i.e.,

$N_{10} = 10^k N'_{10}$ in the decimal system,
$N_2 = 2^p N'_2$ in the binary system.


Thus the number 97.35 may be represented as $10^2 \cdot 0.9735$. To represent
the number in a machine we must indicate both its exponent ($p$ or $k$)
and its sequence of digits. Thus all the digits in the number are made
use of independently of its size; i.e., every number is represented by its
entire set of significant digits with the same relative error. This increases
the accuracy of the computation, especially for multiplication, so that
in most cases one can dispense with a special choice of scale.

Increased accuracy and simplified programming in the floating-point
machines are attained at the expense of some complication in the
arithmetic unit, particularly in the operations of addition and subtraction.
Since numbers may initially have different exponents, it is necessary to
provide them with the same exponents before adding or subtracting them,
in which process the final digits of the smaller number are discarded, thus:

$10^2 \cdot 0.7587 + 10^0 \cdot 0.3743 = 10^2 \cdot 0.7587 + 10^2 \cdot 0.0037 = 10^2 \cdot 0.7624.$
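
The equalizing of exponents can be sketched with four-digit decimal
mantissas, as in the example just given. The representation of a number
as a pair (exponent, mantissa digits), the fixed length of four digits,
and the restriction to nonnegative numbers are assumptions made to keep
the sketch short.

```python
DIGITS = 4  # assumed number of mantissa places


def fp_add(x, y):
    """Add two numbers given as (exponent, mantissa) pairs.

    The pair (e, m) stands for 10**e * 0.m, where m is an integer written
    with DIGITS decimal places; (2, 7587) means 10**2 * 0.7587.
    """
    (ex, mx), (ey, my) = x, y
    if ex < ey:                       # make x the number with larger exponent
        (ex, mx), (ey, my) = (ey, my), (ex, mx)
    my //= 10 ** (ex - ey)            # shift right; final digits are discarded
    total = mx + my
    if total >= 10 ** DIGITS:         # mantissa overflow: renormalize
        total //= 10
        ex += 1
    return ex, total


print(fp_add((2, 7587), (0, 3743)))
```

The digits 43 of the smaller number are lost in the shift, which is the
loss of accuracy referred to above.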

The code for a number in the binary system for a fixed-point machine
consists simply of its sequence of digits (the number is assumed to be
less than one); for example:

$.00110110000000 = \frac{27}{128}.$


In floating-point machines a specific part of the code describes the
exponent, which is also coded in the binary system. An example of the
way in which a number is expressed in such a code is

$2^3 \cdot \frac{27}{32} = 0011\,.\,11011000000.$


In addition, it is customary to reserve two places for the algebraic sign
(for example, “+” in the form 0 and “−” in the form 1), one for the sign
of the exponent and one for the sign of the number itself.

Instructions are coded the same way as numbers are, a specific part
of the code being allotted to expressing the index (in the binary system)
of the operation and another to the indexes of the memory location of
each address.


§3. Technical Principles of the Various Units of a 
High-Speed Computing Machine 

The order of performing the operations in electronic computing machines. 

The performance of each arithmetic operation in a machine in accordance
with a given list of instructions may be reduced to the following successive
steps (it is understood that we are talking about a three-address system
of instructions).

1. Transfer of the first number from the memory unit to the arithmetic
unit (the location of this number in the memory unit is given in the first
address of the instruction code).

2. Transfer of the second number from the memory unit to the arithmetic
unit (its location is given in the second address of the instruction
code).

3. Performance by the arithmetic unit of the given operation on these
numbers in accordance with the operation code.

4. Transfer of the result from the arithmetic unit to the corresponding
location in the memory unit (the index of this location is given in the
third address of the instruction code).

5. Selection from the memory unit of the next instruction, whereupon
the machine begins to carry out the next operation.

In the machine the instruction code is accepted in the “instruction
memory block” (IMB, figure 4). An electronic commutator (EC) transforms
the binary number of the operation code into an activating voltage
in one of its output circuits corresponding to the given arithmetic
operation. This voltage through the control unit (CU) prepares the circuits
of the machine to perform the required operation.

Fig. 4. Structural diagram of an electronic digital computer.

In order to select the first number, the first address code of the
instruction (A1) is transferred via the address code bus bars (ACBB) from
the instruction memory block (IMB) to the control memory block (CMB).
The signal for the transfer of this code is given by the control unit (CU)
of the machine. From the location of the memory unit (MU) that
corresponds to the code number transmitted, the first number is selected
and via the code bus bars (CBB) is placed in the arithmetic unit (AU). The
opening of the input circuits in the arithmetic unit is effected by a
corresponding signal from the control unit (CU) of the machine.

The second number is selected in a similar manner. A signal from the
control unit (CU) of the machine transfers the code of the second
instruction address (A2) from the instruction memory block (IMB) to the
control memory block (CMB). The second number, taken in this way
from the memory unit (MU), is transferred via the code bus bars (CBB)
into the arithmetic unit (AU).

The arithmetic unit (AU) performs the given operation with the numbers 
in accordance with the operation code inserted in it previously. 

In order to effect the transfer of the result thus obtained into the
memory unit, the third address code of the instruction (A3) is transferred
via the address code bus bars (ACBB) from the instruction memory
block (IMB) to the control memory block (CMB). The signal for the
transfer of this code is given by the control unit (CU) of the machine.
The memory location corresponding to the number thus obtained is then
selected and its input circuits are opened. The rules for the selection or
insertion of numbers are given by signals from the control unit (CU) of
the machine. The signal from the control unit (CU) of the machine
transfers the result obtained from the arithmetic unit (AU) to the code
bus bars (CBB), via which the number is placed in the chosen location
of the memory unit.

The instruction control block (ICB) is provided for the selection of the
instructions. In this block is given the number of the chosen instruction.
Usually the instructions go in numerical order so that, to give the number
of the following instruction, it is necessary that the number found in the
instruction control block (ICB) be increased by one. This is done by the
control unit of the machine (circuit +1). The instructions are stored in
the memory unit. For selection of the next instruction the newly obtained
number is transferred via the address code bus bars (ACBB) from the
instruction control block (ICB) to the control memory block (CMB).
The signal for this transfer comes from the control unit of the machine
(CU). The new instruction taken from the memory unit (MU) is transferred
via the code bus bars (CBB) into the instruction memory block (IMB),
the output circuits of which are opened by a signal from the control
block of the machine. This concludes one cycle of the operation of the
machine. In the next cycle the machine performs the newly received
instructions. The normal succession of instructions in numerical order
may be altered by performing a control operation; for example, a
comparison instruction. This instruction does not call for any arithmetic
operation but specifies the course of the computational process. If the
first number is less than the second, then it is necessary to go over to the
instruction whose number is shown in the third address. But if the first
number is greater than or equal to the second, then we pass on to the
next instruction.

In transferring the comparison instruction code to the instruction
memory block (IMB) an electronic commutator (EC) transforms the
binary number of the operation code to an activating voltage in that one
of its output circuits which corresponds to this operation. This voltage
prepares the circuits of the machine for performing the operation of
comparison.

The selection from the memory unit of the two numbers whose locations
are given in the first and second addresses of the comparison instruction
is carried out in exactly the same way as an arithmetic operation. The
comparison of the numbers in the arithmetic unit (AU) may be carried
out by subtracting the second number from the first. Depending on the
sign of the result the control unit (CU) either transfers the code number
of the next command from the third address (A3) via the address code bus
bar (ACBB) to the instruction control block (ICB), or adds one to the
number which is found in this block (circuit +1), exactly as in performing
an arithmetic operation. After the number of the next command has been
placed in the instruction control block (ICB), its selection from the
memory unit is effected in the same way as in an arithmetic operation.

The arithmetic unit and the control unit. Electronic computing machines
make use of present-day devices for electronic automatization.
Basically the units of the machine work on the crude principle of “yes”
or “no”; i.e., essentially there either is a signal or the signal is absent.
Consequently, we may vary the parameters of an electronic circuit rather
widely without affecting the operation of the machine.

One of the most widely used elements applied in electronic machines
is the flip-flop or trigger cell. The simplest flip-flop (figure 5) consists of
two amplifiers with plate resistors $R_a$, connected by the divider resistors
$R_1$ and $R_2$. The bias established (0 B ) is chosen so that one of the tubes
operates and the other does not. Since the two halves of the circuit are
symmetric, either tube may be closed; i.e., the circuit has two stable
positions of equilibrium. In fact, if the left tube is closed, and the right
one is open, then on the plate of the left tube ($O_1$) there will be a high
voltage, and on the plate of the right tube ($O_0$) a low voltage (because
of the voltage drop on the plate resistance $R_a$ from the current through
the tube). These voltages are transferred through the divider resistors $R_1$
and $R_2$ to the grids of the opposite tubes, and consequently there will be
a small voltage on the grid of the left tube and a high voltage on the grid
of the right tube. With a proper choice of the parameters of the circuit,
these grid voltages will keep the tubes in the given state.



Fig. 5. The circuit of a flip-flop. 


Similarly, if the left tube is open and the right one closed, there will be 
low voltages on the plate of the left tube and on the grid of the right 
tube and high voltages on the plate of the right tube and on the grid of 
the left tube. 

The flipping of a flip-flop from one state to the other may be brought
about by negative pulses placed on the grids of the tubes through diodes.
If we place a negative pulse on the grid of the left tube, then the left tube
is closed, and its plate voltage will increase. This produces a higher voltage
on the grid of the right tube, which opens the right tube. In this manner,
the trigger assumes the first position of equilibrium (high voltage on the
plate of the left tube). But if a negative pulse is placed on the grid of the
right tube, the flip-flop assumes the second stable equilibrium position
(a high voltage on the plate of the right tube). If a negative pulse is placed
simultaneously on the grids of both tubes, then each such pulse will
cause the flip-flop to move from one state of equilibrium to the other.

If we consider the circuits by which pulses are placed on the grids of
the tubes as inputs of the system and the plate voltages as outputs, we
have the diagram in figure 6 for the operation of a flip-flop.

The properties of flip-flops make them convenient for use in the various
units of an electronic computing machine. To one equilibrium state of the
flip-flop (for example, to high voltage on the right output $O_0$) we may
assign the code value “0,” and to the other (high voltage at the left
output $O_1$) the code value “1.” Correspondingly, the inputs may be
denoted by $I_0$, $I_1$, and $I_c$ (the counting input).
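
Seen from outside, a flip-flop is a cell with two states and three inputs.
The sketch below models only this logical behavior, not the tubes and
resistors; the class and method names are assumptions made for the
example.

```python
class FlipFlop:
    """Two-state cell: code 1 means high voltage on O_1, code 0 on O_0."""

    def __init__(self):
        self.state = 0

    def pulse_zero(self):
        """Pulse on the zero input: the cell goes to code position 0."""
        self.state = 0

    def pulse_one(self):
        """Pulse on the unit input: the cell goes to code position 1."""
        self.state = 1

    def pulse_count(self):
        """Pulse on the counting input: the cell changes its state."""
        self.state ^= 1

    def outputs(self):
        """Which of the two outputs carries the high voltage."""
        return {"O_1": self.state == 1, "O_0": self.state == 0}


cell = FlipFlop()
cell.pulse_one()
print(cell.outputs())
cell.pulse_count()
print(cell.outputs())
```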

Flip-flops are used in electronic machines for the temporary storage
of codes (receiving registers) (figure 7). Initially all the flip-flops are set
in the code position “0” by means of negative pulses ($I_E$) on the zero
inputs of all cells. The code of a number or of an instruction is placed
on the unit inputs of the flip-flops in the form of negative pulses. In
those positions in which there are code pulses the flip-flops pass to the
code position “1” and hold this position until they receive an extinguishing
pulse ($I_E$). Receiving registers are used in the arithmetic units for storing
the code of an instruction, for giving the number of a required location
of the memory unit, and so forth.

Fig. 7. Diagram of a receiving register of flip-flops.

A second realm of application of flip-flops is in addition circuits. Here
use is made of the property of a flip-flop that it changes its state of
equilibrium every time a negative pulse is applied to the counting input
(simultaneously to both inputs). If the flip-flop starts in the code position
“0,” then the application of a pulse moves it into code position “1.”
But if the flip-flop starts in code position “1,” then the application of a
pulse moves it to code position “0.” In the absence of a pulse the flip-flop
remains in its previous position. The initial position of the flip-flop
may be considered as a code for a given digit of the first number, and the
presence or absence of a pulse as the code for the same digit of the
second number. Here it is easy to see that the behavior of the flip-flop
exactly corresponds to the rules of addition of binary numbers for one
digit (0 + 0 = 0; 0 + 1 = 1; 1 + 0 = 1; 1 + 1 = 10, i.e., “0” in the
given position and the carrying of “1” to the next position). In order
that the addition circuit may work for several binary digits, it is necessary
to guarantee the carry from one digital position to the next. A carry in
the original position is caused by the addition of two units, i.e., by the
passage of the flip-flop from the code position “1” to the code position
“0.” In this passage the voltage on the left output of the trigger is changed
from high to low. If this voltage is passed through a circuit containing
a condenser and a resistor, then in leaving the circuit it causes a negative
pulse. Through a delay line this carry pulse may be directed into the
counting input of the next position.

Figure 8 represents the simplest addition circuit with flip-flops. Initially
all the flip-flops are set in the code position “0” by a pulse $I_0$ placed on
their zero inputs. On reception of the code of the first number, which
appears in the form of negative pulses on the counting inputs, the
flip-flops assume a position corresponding to the code of the first number.
On reception of the code of the second number, there occurs digit-by-digit
addition of the binary numbers, and in those positions where the addition
has produced two ones, there arise carry pulses that after a time delay
$t_d$ are applied to the counting inputs of the flip-flops in the higher
positions. These carry pulses may move the flip-flops from the code
position “1” to the code position “0.” In this case there arises a carry
pulse to the next higher position. In the worst case, when in the addition
of the codes all the positions are set in the code position “1,” and the
lowest position passes from code position “1” to code position “0,” the
carry pulse arises successively in each position after a time delay $t_d$.
In this manner, the total time required for the passage of the carry pulses
will be equal to one time delay multiplied by the number of positions.
More complicated electronic circuits of flip-flops allow the elimination
of such step-by-step carries with consequent shortening of the time
required for addition.

Fig. 8. Addition circuit with flip-flops.

For multiplication of numbers an arithmetic unit of flip-flops (figure 9)
has two receiving registers for storage of the multiplicand and the
multiplier ($R_1$, $R_2$) and with them an adder (Add). Multiplication is
carried out in the following manner. The code of the multiplier is shifted
one place to the right. If in the lowest place the multiplier has the code
“1,” then in the right output of the register of the multiplier there arises
a pulse that is applied to the circuits governing the application of the
code in the multiplicand register to the adder (the circuit +N). After this
has been done the partial product in the adder is moved one place to the
right and the operations are repeated. In this manner the sum of the
partial products is accumulated in the adder. These operations are
repeated as many times as there are digit positions in the number codes.
In the multiplication of two numbers each of which takes up $n$ positions,
the product will take up $2n$ positions. The highest $n$ positions of the
product are distributed in the adder, and the lowest $n$ positions of the
product may be entered one after the other, as the shifts to the right
successively set free the positions in the register of the multiplier. With
the completion of the multiplication, the lowest $n$ digits of the product
are placed in the multiplier register. The time required for multiplication
is roughly equal to the time required for addition multiplied by the
number of digit positions in the number code.
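
The addition and multiplication procedures just described can be
imitated at the level of code positions. The following is a sketch under
assumptions: numbers are held as Python integers, the register width n is
passed explicitly, a carry out of the highest position is simply dropped,
and the function names are invented for the example. The first function
counts how many successive carry delays occur; the second follows the
shifting of the multiplier and of the partial product to the right.

```python
def ripple_add(first, second, n):
    """Add on n counting flip-flops; return (sum mod 2**n, carry delays)."""
    cells = [(first >> i) & 1 for i in range(n)]     # code of the first number
    pulses = [(second >> i) & 1 for i in range(n)]   # code of the second number
    delays = 0
    while any(pulses):
        carries = [0] * n
        for i in range(n):
            if pulses[i]:
                if cells[i] == 1 and i + 1 < n:
                    carries[i + 1] = 1   # passage 1 -> 0 sends a carry pulse
                cells[i] ^= 1            # counting input changes the state
        pulses = carries                 # carries arrive one delay later
        if any(pulses):
            delays += 1
    return sum(bit << i for i, bit in enumerate(cells)), delays


def shift_add_multiply(multiplicand, multiplier, n):
    """Multiply two n-place numbers; return (highest n places, lowest n places)."""
    adder = 0
    register = multiplier
    for _ in range(n):
        lowest = register & 1
        register >>= 1                   # multiplier shifted one place right
        if lowest:
            adder += multiplicand        # multiplicand applied to the adder
        register |= (adder & 1) << (n - 1)   # freed place takes a product digit
        adder >>= 1                      # partial product shifted one place right
    return adder, register


print(ripple_add(0b1010, 0b0111, 5))
high, low = shift_add_multiply(0b1010, 0b0101, 4)
print(high, low, (high << 4) | low)
```

In the worst case named in the text, for instance 0111 + 0001 on four
places, the carry travels through every position in turn, so the number
of delays grows with the number of positions.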

A code shift with flip-flops is produced by the circuit illustrated in
figure 10. Applying the shift pulse ($I_{sh}$) to the zero inputs of all the
flip-flops places them in code position “0.” From those flip-flops which
were in the code position “1,” carry pulses arise which put the adjacent
flip-flops into code position “1” with a time delay $t_d$. In this way, every
application of a shift pulse moves the code one place.

Fig. 9. Multiplication circuit with flip-flops.

Fig. 10. Circuit for shifting a code with flip-flops.

An arithmetic unit with flip-flops which consists of two receiving
registers and an adder also enables us to divide one number by another.

Usually an arithmetic unit with flip-flops is constructed so as to serve
in a universal way for all the arithmetic and logical operations.

Flip-flops are also used in electronic machines for counting pulses,
which is necessary in a number of different control arrangements. The
circuit for an electronic counter (figure 11) differs from the circuit for
an elementary adder (figure 8) only in the omission of the delay line in
the carry pulse links. A counter of this sort can count up to $2^n$ pulses
($n$ is the number of places in the counter), after which the position of
the counter is repeated. At the cost of some complication in the system
it is possible to construct an electronic counter for an arbitrary number
of pulses (not equal to $2^n$).

Fig. 11. Circuit of an electronic counter with flip-flops.
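
A counter of this kind may be sketched as a row of cells in which each
pulse changes the lowest cell and a passage from 1 to 0 is handed on to
the next cell at once. The optional modulus argument imitates a counter
for an arbitrary number of pulses by clearing all cells when that number
is reached; this resetting rule, like the function names, is an assumption
made for the example and not a description of the actual circuit.

```python
def reading(cells):
    """Number represented by the cells, lowest place first."""
    return sum(bit << i for i, bit in enumerate(cells))


def count_pulses(n_places, pulses, modulus=None):
    """Return the counter reading after the given number of pulses."""
    cells = [0] * n_places
    for _ in range(pulses):
        for i in range(n_places):
            cells[i] ^= 1          # pulse on the counting input
            if cells[i] == 1:      # passage 0 -> 1: no carry, stop here
                break
        if modulus is not None and reading(cells) == modulus:
            cells = [0] * n_places  # assumed reset for a non-binary cycle
    return reading(cells)


print(count_pulses(3, 5), count_pulses(3, 8), count_pulses(4, 23, modulus=10))
```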

For the realization of logical operations and control circuits in electronic
computing machines, we make use of coincidence units (the so-called
“AND” elements), of inverters, and of divider diode links (“OR”
elements).

The AND elements work on the logical principle of “both-and”
(“one and also the other”); i.e., at the output of such a unit a signal
will occur only in case there are signals at all inputs. Inverters work on
the logical principle of “yes-no”; i.e., if there is a signal on an input,
then there will be no signal at the output, and conversely, when there
is no signal on the input, then there is an output signal. The OR elements
obey the logical law “either-or”; i.e., at the output there will be a
signal in the case when there is a signal at any one input.
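
The three kinds of elements correspond to the elementary logical
operations, with the presence of a signal modeled as True and its absence
as False. The function names below are assumptions made for the example.

```python
def and_element(*inputs):
    """Signal at the output only if there are signals at all inputs."""
    return all(inputs)


def or_element(*inputs):
    """Signal at the output if there is a signal at any input."""
    return any(inputs)


def inverter(signal):
    """Signal at the output exactly when there is none at the input."""
    return not signal


# Table of outputs for two inputs.
for a in (False, True):
    for b in (False, True):
        print(a, b, and_element(a, b), or_element(a, b), inverter(a))
```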

AND elements are widely used for “channeling” electric signals in
a machine, i.e., for directing signals to the required circuits. For example,
figure 12 illustrates a code bus bar for one of the digits of a number. This
code bus bar is joined through AND elements to the inputs and outputs
of the locations of the memory unit, to the inputs of two receiving registers
of the arithmetic unit and to the output of an adder. Applying a control
signal to the output AND element of any location of the memory
unit, we thereby put the code stored in this location onto the code bus
bar. If we simultaneously put a control signal on the input AND elements
of the first receiving register, for example, then the code on the bus bar
is entered into the first register. Similarly, if we put a control signal
on the output AND units of the adder, then the code which is produced in
the adder is transferred to the code bus bar. If at the same time we place
a control signal on the input AND elements of any location of the memory
unit, then the codes being transferred by the code bus bars will be
received in this location. Of course, before receiving codes in locations
of the memory unit or in the receiving registers of the arithmetic unit,
it is necessary to clear the codes which were in them previously.
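For a single digit, the channeling described here can be sketched as follows. This is a simplified model under assumptions made for illustration: the functions drive_bus and receive are hypothetical names, the bus is taken to carry the OR of all gated sources, and a receiver is cleared at the moment its input element is opened.

```python
def and_element(a, b):
    return a & b


def drive_bus(source_digits, output_controls):
    """Digit on the bus bar: each source passes through its own AND element.

    Only the sources whose output control signal is 1 reach the bus.
    """
    bus = 0
    for digit, control in zip(source_digits, output_controls):
        bus |= and_element(digit, control)
    return bus


def receive(stored_digit, bus_digit, input_control):
    """Digit held by a receiver after the transfer.

    With no control signal the receiver keeps its old digit; with a control
    signal it is first cleared and then takes the digit from the bus.
    """
    if not input_control:
        return stored_digit
    cleared = 0
    return cleared | and_element(bus_digit, input_control)


if __name__ == "__main__":
    memory_digits = [1, 0, 1]                    # one digit of three locations
    bus = drive_bus(memory_digits, [0, 0, 1])    # open location 2 only
    print(receive(0, bus, 1))                    # first register receives it
```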

This example does not exhaust all the various applications of AND
elements for channeling electric signals in an electronic computing
machine. They also are widely applied in the memory unit, in the
arithmetic unit, and in the control unit of the machine.

Fig. 12. Channeling of signals by an AND element.

Fig. 13. Circuit of an electronic commutator with four output links.

In addition to solving problems of channeling signals, AND elements
perform more complicated functions. For example, when we access
a location of the memory unit, there often arises the problem
of converting the number of the location (given in binary form) to a
control voltage placed on this location. This problem is handled by the
electronic commutator, constructed from AND elements. Figure 13
illustrates a circuit for an electronic commutator with four output links.
The number of the location is given in the form of a binary code on
two flip-flops. All four possible combinations of the states of these
flip-flops are given in Table 5.


Table 5. 
If in an AND element the high voltage is controlling, then to get a
signal on the zero-output link it is necessary that the inputs of the AND
element be connected to the right outputs of the first and second
flip-flops. In this case on the output of this AND element, there will be a
signal only when the flip-flops are found in the code position “00.”
Similarly, to get a signal on the first output link (the code “01”), the
inputs of the corresponding AND element must be connected to the left
output of the first flip-flop and to the right output of the second flip-flop.
The connections of the AND elements for the second (code “10”) and
third (code “11”) links will also be made on the same principle.
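The wiring of the commutator can be checked with a short sketch. It rests on two assumptions read off from the description: the right output of a flip-flop carries the high voltage when the flip-flop stores 0 and the left output when it stores 1, and the first flip-flop holds the lower digit of the code. The connections for the second and third links are inferred "on the same principle" and are not spelled out in the text.

```python
def flip_flop_outputs(stored):
    """Return (left, right) output voltages of a flip-flop storing 0 or 1."""
    return stored, 1 - stored


def commutator(first, second):
    """Signals on the four output links for the given flip-flop states."""
    left1, right1 = flip_flop_outputs(first)
    left2, right2 = flip_flop_outputs(second)
    return [
        right1 & right2,  # link 0, code "00"
        left1 & right2,   # link 1, code "01"
        right1 & left2,   # link 2, code "10" (inferred wiring)
        left1 & left2,    # link 3, code "11" (inferred wiring)
    ]


if __name__ == "__main__":
    for second in (0, 1):
        for first in (0, 1):
            print(f"code {second}{first}:", commutator(first, second))
```

For every code exactly one link carries a signal, which is what is required of a control voltage selecting one location.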

In a number of cases the AND elements together with inverters and
OR elements are used in the construction of the arithmetic units. For
digit-by-digit addition of numbers with two binary digits, we have the
four possible combinations in Table 6.
These relations may be realized, for example, by the circuit shown in
figure 14. Such circuits are called “semiadders.” The carry signal for the
higher of the two positions is produced by an AND element (combination 4).
To get the signal of the sum (combinations 2 and 3), it is sufficient to
have a signal on one of the two inputs with the absence of an output carry
signal, which may be done by an AND element, an inverter, and a diode link
unifier. For addition of numbers it is necessary to consider not only the
digits in a given position but also the carry from the preceding position.
Adding the carry may be treated as a repeated addition, in which the carry
from the previous position is added to the result already produced. In this
manner, the connection in series of two semiadders fully guarantees the
addition of one position in two binary numbers.

The circuit of an adder for one position may also be realized directly 
by considering the possible combinations and taking account of carrying 
from the preceding lower position. 
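Both constructions can be written down and compared. In this sketch the semiadder follows the description given above (the carry from an AND element, the sum from an OR element gated by the inverted carry), and the one-place adder is formed from two semiadders in series. The use of an OR element to combine the two partial carries is an assumption, since the text does not state how they are joined.

```python
def semiadder(a, b):
    """Add two binary digits; return (sum digit, carry)."""
    carry = a & b                    # AND element
    total = (a | b) & (1 - carry)    # OR element gated by the inverted carry
    return total, carry


def one_place_adder(a, b, carry_in):
    """Add two digits and the carry from the preceding position."""
    partial, carry1 = semiadder(a, b)
    total, carry2 = semiadder(partial, carry_in)
    # The two carries can never both occur, so an OR element joins them.
    return total, carry1 | carry2


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, one_place_adder(a, b, c))
```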

It is most effective to use adder circuits built from AND elements in
machines with sequential code distribution. In this case the code of a
number is transferred by one of the code bus bars. The digits of the number
follow one after another at strictly determined intervals of time. For the
addition of numbers, we may then use a one-place adder (figure 15).

The codes of both numbers are fed, lowest positions first, to the two
basic inputs of the one-place adder. The carry output is run through a
delay line to the third input of the adder. The time of the delay is taken
equal to the interval between pulses. In this manner, if in the addition of
any digit of the numbers there occurs a carry pulse, it is placed on the
input of the adder at exactly the same time as the occurrence of the pulses
in the next higher position. The time required for addition of two numbers
is equal to the time required for the passage of the code of one number.

Fig. 14. Circuit for a one-place semiadder.

Fig. 15. Circuit for a series adder with AND elements.
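The action of the delay line is easy to follow in a small model. The sketch below assumes that both codes arrive lowest position first and have the same number of places; the helper names (series_add, to_digits, from_digits) are introduced here only for illustration, and the one-place adder is written directly from the possible combinations.

```python
def one_place_adder(a, b, carry_in):
    """Sum digit and carry for one position."""
    total = a ^ b ^ carry_in
    carry = (a & b) | (carry_in & (a ^ b))
    return total, carry


def series_add(first, second):
    """Add two digit sequences of equal length, lowest position first."""
    delayed_carry = 0  # contents of the delay line
    result = []
    for a, b in zip(first, second):
        total, carry = one_place_adder(a, b, delayed_carry)
        result.append(total)
        # The carry is held for exactly one pulse interval, so it meets
        # the digits of the next higher position.
        delayed_carry = carry
    return result, delayed_carry


def to_digits(value, places):
    return [(value >> i) & 1 for i in range(places)]


def from_digits(digits):
    return sum(digit << i for i, digit in enumerate(digits))


if __name__ == "__main__":
    digits, overflow = series_add(to_digits(11, 5), to_digits(6, 5))
    print(from_digits(digits), overflow)
```

The loop runs once per position, which corresponds to the statement that addition takes as long as the passage of one code.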

Multiplication of two numbers in a series code may also be done with
a one-place adder, and here it is necessary to put the numbers through
the adder a number of times equal to the number of positions occupied
by the number code, i.e., the time required for multiplication is “n” times
as long as the time for addition.
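The text does not describe the multiplication scheme in detail. One common arrangement, assumed here purely for illustration, is addition of shifted partial products, one pass through the adder for each digit of the multiplier. The function name series_multiply is hypothetical, and ordinary integer addition stands in for one passage of the codes through the one-place adder.

```python
def series_multiply(multiplicand, multiplier, places):
    """Multiply by repeated addition, one pass through the adder per place.

    Returns the product and the number of passes that were needed.
    """
    accumulated = 0
    passes = 0
    for position in range(places):
        digit = (multiplier >> position) & 1
        # The partial product is the shifted multiplicand or zero, depending
        # on the current digit of the multiplier.
        partial = (multiplicand << position) if digit else 0
        accumulated += partial  # stands for one pass through the adder
        passes += 1
    return accumulated, passes


if __name__ == "__main__":
    print(series_multiply(13, 11, 4))
```

The number of passes equals the number of places n, in agreement with the estimate that multiplication takes n times as long as addition.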

Memory units. The possibilities of a machine are to a great extent
determined by the capacity of its memory unit, i.e., by the number of
numbers that can be stored in the machine. For contemporary universal
electronic computing machines this capacity is usually 500-4,000 numbers.

For code storage it is possible to use flip-flops. However, the amount 
of apparatus here turns out to be so large that this form of memory unit 
is almost never used. 

For machines with series operation, widespread application has been
found for memory units consisting of electroacoustic mercury tubes
(figure 16). An electric signal in the form of a pulse is placed on a quartz
crystal at the input of the tube. The quartz crystal has the property of
transforming an electrical pulse into a mechanical oscillation, and
conversely. In this manner the entering electrical signal is transformed into
a mechanical (ultrasonic) vibration, which is propagated along the tube
with a specific velocity. When the signal reaches the end of the tube,
it falls on a receiving quartz crystal and is transformed again into an
electrical pulse. After being amplified and put into its original form, the
signal is again directed toward the input of the tube. In this manner,
the codes of the numbers introduced in the form of pulses in the mercury
tube are circulated indefinitely in the tube. To introduce the numbers
into the tube, a code from the machine is placed on the input of the tube,
and simultaneously the circuit for the return of pulses from the end of
the tube is broken for the same period of time. For the selection of a
number, at the instant when the required code reaches the
end of the tube, the output links are opened, thereby transmitting the
code to the other units of the machine. The entry and removal of the
numbers is accomplished automatically by appropriate electronic circuits.
Usually, with the goal of simplifying the apparatus, several numbers are
stored in each mercury tube. Thus for access to a number, it is necessary
to wait while the required code goes to the end of the tube. The more
numbers there are stored in the tube, the greater the time required to
find a required number.

Fig. 16. Basic circuit for dynamic storage of a code
in an electroacoustic tube:
(1) mercury tube; (2) transmitting quartz crystal;
(3) receiving quartz crystal; (4) transmitted form
of the pulse; (5) received form of the pulse.
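The circulation of codes and the waiting time for access can be imitated by a queue that is advanced one pulse interval at a time. This is a sketch only: the class MercuryTube and its methods are invented names, time is counted in pulse intervals, and each number is assumed to occupy a fixed run of consecutive pulse positions.

```python
from collections import deque


class MercuryTube:
    """Circulating store for `words` numbers of `word_length` digits each."""

    def __init__(self, words, word_length):
        self.word_length = word_length
        self.line = deque([0] * (words * word_length))
        self.time = 0  # elapsed pulse intervals

    def tick(self, incoming=None):
        """Advance one pulse interval and return the digit leaving the tube."""
        outgoing = self.line.popleft()
        # Normally the amplified pulse is returned to the input; while a new
        # code is being entered the return circuit is broken instead.
        self.line.append(outgoing if incoming is None else incoming)
        self.time += 1
        return outgoing

    def _wait_for(self, word):
        """Let the codes circulate until `word` arrives at the end of the tube."""
        waited = 0
        while self.time % len(self.line) != word * self.word_length:
            self.tick()
            waited += 1
        return waited

    def write(self, word, digits):
        """Enter a code of exactly `word_length` digits; return the waiting time."""
        waited = self._wait_for(word)
        for digit in digits:
            self.tick(incoming=digit)
        return waited

    def read(self, word):
        """Return (digits, waiting time); the code keeps circulating."""
        waited = self._wait_for(word)
        digits = [self.tick() for _ in range(self.word_length)]
        return digits, waited


if __name__ == "__main__":
    tube = MercuryTube(words=4, word_length=3)
    tube.write(2, [1, 0, 1])
    print(tube.read(2))
```

The waiting time depends on where the required code happens to be in its circuit, and its largest value grows in proportion to the number of codes stored in one tube.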

Series machines with memory units composed of electroacoustic
mercury tubes operate at a rate of 1,000-2,000 operations per second.

For memory units one often applies the principle of magnetic recording
of electrical signals, similar to the recording of sound. The record may be
made either on a magnetic tape or on a continuously revolving drum
covered with a ferromagnetic material (figure 17). Along the generator
of the drum there are placed magnetic heads. If at a specific instant of
time current pulses are passed through the windings of the magnetic
heads, then in the corresponding places on the surface of the drum the
signals will be recorded in the form of residual magnetization. With the
rotation of the drum the field resulting from the residual magnetization,
passing under the heads, causes in them electric signals, which are
amplified and transmitted to the other units of the machine.

Fig. 17. Basic scheme of a magnetic drum:
(1) current through the coil; (2) residual magnetization;
(3) emf in the coil in read-out.

A magnetic drum may be used both for a series system and for a parallel
system of transmitting codes. However, the drawback of electroacoustic
mercury tubes, namely the delay in access to numbers, is even more
characteristic of the magnetic drum. Thus memory units with magnetic
drums are used for machines of comparatively low speed (of the order
of several hundred operations per second). On the other hand, a magnetic
drum allows a marked increase in the capacity of the memory unit with
only a tolerable increase in the amount of apparatus. Thus the magnetic
drum and the magnetic tape are often used in universal machines as
complementary (exterior) memory units in addition to fast-acting (operative)
memory units.

In high-speed electronic computing machines with parallel operation,
cathode-ray tubes are often used for the memory unit (figure 18). If the
electron beam is directed at any point of the screen, then at this point
there is accumulated an electric charge. The charge will be preserved for
a considerable time, so that it is possible to record number codes on the
screen. In the process of making a computation, a beam of electrons is
again directed to the required point. If the given element has not been
charged, it now receives a charge, and through the signal plate and the
output amplifier there emerges a code pulse. But if the element is charged,
the signal does not emerge. In this way we can determine whether a
signal has been recorded at a given point or not. After access to the code
we must re-establish the previous state of the given element, which is
done automatically by a special circuit. In exactly the same way it is
necessary to renew the code recordings periodically, in order to avoid an
essential change in the charge by stray electrons and leakage through the
dielectric.



Fig. 18. Basic scheme of a cathode-ray tube:
(1) source of electrons; (2) deflection plates;
(3) screen; (4) signal plate.







Usually there are 1,024 (32 x 32) or 2,048 (32 x 64) points distributed
over the screen. The direction of the beam of electrons to the required
point is accomplished by appropriate voltages on two pairs of deflecting
plates.

In machines with parallel operation, every digit of a binary number
requires its own cathode-ray tube, and access to a number is made
simultaneously for all tubes. The access time, including the entire operation of
the element, may be reduced to a few microseconds.

Recently use has been made of memory units with magnetic elements
that have rectangular hysteresis loops (figure 19). If we put a positive
signal through the coil, then the core is positively magnetized, and for a
negative signal it will be negatively magnetized.

With the removal of the signal, the core remains magnetized either
positively or negatively. Thus, the state of the core characterizes the
signal recorded. In the computing process, there passes through the coil a
signal of specific polarity, for example, a positive one. If in this case
the core was magnetized negatively, then a remagnetization will occur (a
change in the magnetic flux), and in the output coil there will be induced
an electromotive force, which is fed into an amplifier. But if the core was
magnetized positively, then a change in its state will not take place, and
no signal will arise in the output coil. In this way it is possible to
distinguish which signal has been placed on a given element. Of course,
after access has been had to the code, it is necessary to restore the
original state of the core, which is done by a special circuit.
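The cycle of interrogation and restoration may be sketched as follows. The class MagneticCore is a hypothetical model in which the state of the core is simply +1 or -1; it ignores all electrical detail and shows only why the reading destroys the record and how the special circuit puts it back.

```python
class MagneticCore:
    """Memory element with two stable states of magnetization."""

    def __init__(self):
        self.state = -1  # start negatively magnetized

    def write(self, signal):
        # A positive signal magnetizes the core positively, a negative one
        # negatively; the state remains after the signal is removed.
        self.state = +1 if signal > 0 else -1

    def interrogate(self):
        """Pass a positive signal; report whether an emf was induced."""
        emf_induced = self.state == -1  # remagnetization changes the flux
        self.state = +1                 # the core is now positive in any case
        return emf_induced

    def read(self):
        """Read the recorded signal and restore the original state."""
        was_negative = self.interrogate()
        if was_negative:
            self.state = -1  # restoring circuit rewrites the old record
        return -1 if was_negative else +1


if __name__ == "__main__":
    core = MagneticCore()
    core.write(-1)
    print(core.read(), core.state)
```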

§4. Prospects for the Development and Use of 
Electronic Computing Machines 

The use of electronic computing machines will inevitably have a great
influence on the development of many fields in contemporary science and
technology, especially in the physical and mathematical sciences. Thus it
is appropriate for us to indicate the basic prospects for further application
of computing machines and their significance for mathematics.

Fig. 19. Basic scheme of a memory element with a rectangular
hysteresis loop:
(1) input coils; (2) output coils.

Further extension of the areas of application of mathematical machines.
1. Improved machines. At the present time there is continuous and
intensive technological progress going on in the production of high-speed
computing machines, in further improvement of their construction, and
in the use of new physical principles and of combinations of new types.
Thus we may expect better technical properties for these machines
(speed, capacity of the memory, regularity, and reliability of operation),
and also a notable simplification in their construction and use which will
guarantee their widespread distribution.

The diversity of the types of the machines will be another factor
ensuring their widespread use. Along with powerful machines of enormous
capacity there will be the small-gauge machines that are simple to use
and are within the purchasing power of any scientific or planning institute
or of a factory; in addition to the universal ones, there are simpler special
machines, intended for some specific range of problems; besides the
purely digital machines other types have been invented, which accept
data from certain devices, perform digital calculations on them, and then
give out the results again continuously in the form of curves or of values
of parameters controlling various units of the machine.

2. Better programming. A second path to new effectiveness in the use
of these machines is further improvement in methods of programming.
The construction of programs in the usual manner, described in §2, is
easy for comparatively simple mathematical problems; in actual problems
of any magnitude, it involves very complicated and detailed labor. This
work may be lightened to a certain extent by the use of a “library” of
standard subroutines, set up permanently for the calculation of basic
functions and for performing certain necessary mathematical operations,
such as inversion of a matrix or numerical integration. In spite of this,
the fitting of subroutines into the basic program, addressing and
readdressing the results, and testing and rearranging the program is a
quite complicated and detailed task calling for definite skill. This fact
may essentially delay the setting up of new problems for electronic
machines.

There are two possibilities for further development in this direction. 
One of them consists of constructing the program automatically by using 
the machine itself for this purpose, i.e., by converting the basic formulas 
and the logical structure of the problem, placed in the machine in coded 
form, into the desired program through the operation of the machine 
in accordance with a special “programming program.”
The second direction consists of having the machine operate on a
certain special universal program, which immediately examines and
performs the operations in accordance with a general plan of computation
introduced into the machine; this general plan would contain a number
of important problems (for example, the solution of a system of equations)
and, without setting up the detailed working of the program, would
guarantee that the correct results were worked out and assigned to each
particular problem.

3. More intellectual tasks. Further progress in the application of 
computing machines in mathematics is connected with the use of the 
machines for the performance not only of numerical calculations but 
also of the standard calculations of analysis. 

Basically such a possibility is, in well-known cases, altogether
practicable. For example, if we describe a polynomial by its set of coefficients,
then such operations as multiplication and division of polynomials
consist of arithmetic operations on sequences of coefficients, which are
easily programmed on machines. By the use of specific coding in describing
a function, it is completely possible to construct a program which gives
the derivative of an elementary function (described in the same code),
i.e., which allows one to perform the analytic process of differentiation.
All these facts ensure the possibility in the future of solving problems by
a specific method (for example, of solving a system of differential equations
by means of power series), with complete carrying out of all the analytic
and numerical calculations. In this manner, computing machines may be
used for performing quite subtle and typically intellectual tasks (but only
of a standard character), just as the present machines of the everyday
world have replaced the physical labor not only of the stevedore but
also of the seamstress.
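As an illustration of operations on sequences of coefficients, the following sketch multiplies, divides and differentiates polynomials stored as lists of coefficients, lowest degree first. The representation and the function names are choices made here for the example; exact fractions are used in the division so that no rounding enters.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given by their coefficients."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def poly_divmod(p, d):
    """Quotient and remainder of p divided by d (leading coefficient of d nonzero)."""
    remainder = [Fraction(c) for c in p]
    quotient = [Fraction(0)] * max(len(p) - len(d) + 1, 1)
    for shift in range(len(p) - len(d), -1, -1):
        factor = remainder[shift + len(d) - 1] / d[-1]
        quotient[shift] = factor
        for j, c in enumerate(d):
            remainder[shift + j] -= factor * c
    return quotient, remainder[:len(d) - 1]


def poly_derivative(p):
    """Coefficients of the derivative: the term c*x^k becomes k*c*x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:]


if __name__ == "__main__":
    square = poly_mul([1, 1], [1, 1])    # (1 + x)^2
    print(square, poly_divmod(square, [1, 1]), poly_derivative(square))
```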

The influence of high-speed machines on numerical and approximative
methods. The means and instruments used in any task naturally influence the
methods of the work itself. For example, trigonometric formulas computed
by using logarithms are unsuitable for use on computing machines, on
which only multiplication and division can be carried out directly. The
use of a desk machine calls for entirely different computational schemes
in approximation methods (for example, nondifference schemes in
differential equations).

The fundamental changes in computational instruments and the
possibilities that have been opened up by the use of electronic computing
machines have naturally brought about a change of attitude not only
toward the methods of computational analysis but also, to a great extent,
toward the problems of mathematics in general and their applications.

Let us consider a few questions where the changes are most evident.

Mathematical tables and other ways of introducing functions into the
computation. First of all, electronic machines made a fundamental
change in our powers of computing tables. In place of a single table
of functions, we witness an annual output of hundreds of tables, including
complete and exact tables for all the basic special functions, not only
for one but for several variables. But at the same time an essential change
must be made in the structure of the tables. For use in high-speed machines,
compact tables are appropriate, containing widely spread basic values
and designed for interpolation of a high order.
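The advantage of a compact table with interpolation of a high order may be seen in a small experiment. The sketch assumes a table of sin x with the deliberately wide step 0.25 and uses Lagrange's interpolation formula on the entries nearest to the argument; the step, the range of the table and the function names are all choices made for this illustration.

```python
import math


def lagrange_interpolate(xs, ys, x):
    """Value at x of the polynomial passing through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


STEP = 0.25
TABLE_X = [k * STEP for k in range(8)]      # arguments 0, 0.25, ..., 1.75
TABLE_Y = [math.sin(x) for x in TABLE_X]    # the compact table itself


def table_sin(x, order=5):
    """Interpolate sin x from the order + 1 table entries nearest to x."""
    start = int(x / STEP) - order // 2
    start = min(max(start, 0), len(TABLE_X) - order - 1)
    window = slice(start, start + order + 1)
    return lagrange_interpolate(TABLE_X[window], TABLE_Y[window], x)


if __name__ == "__main__":
    for order in (1, 3, 5):
        print(order, abs(table_sin(0.6, order) - math.sin(0.6)))
```

With only eight stored values, interpolation of the fifth order gives several more correct decimal places than linear interpolation in the same table.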

In many cases, in place of tables, it is convenient to use other methods
of introducing functions into the machines, namely polynomials of best
approximation over subintervals, expansions in continued fractions,
approximating formulas based on numerical calculation of an integral
which represents the function, and so forth; all of these may profitably
be introduced, in various cases, into the program of computation of a
given function.

Special functions and partial analytic solutions. The special functions
themselves and the solutions of problems in finite analytic form still
retain their significance for qualitative investigation of a problem and for
clearing up the character of its singularities, both of which are important
for a numerical solution. In certain large-scale problems, the use of such
special functions may provide the most economical means of finding the
solution numerically. Nevertheless, the construction in many particular
cases of an exact or approximate solution, by means of complicated
apparatus or of the special functions that were formerly introduced for
greater ease of computation, has turned out to be a mistaken policy.
For machine calculations it is much simpler and shorter to find the
solution by general numerical methods without making use of any of the
analytic representations discussed earlier.

Thus the very considerable efforts that have been made to put into
complicated analytic form the solutions of various particular problems
in technology and mechanics have in many cases turned out to be wasted.

The choice of computational methods. It is incorrect to say that, because 
of the high productivity of electronic machines, there is no need to develop 
approximating methods further and that we may always use the most 
primitive methods. In reality, only for the simplest one-dimensional 
problems where, independently of the choice of method, the calculation 
will not run to more than a few thousand steps, can the solution be 
found on the machine in a few seconds or minutes. 

For the systematic solution of newer, more complicated problems the
number of steps may well amount to several hundred million, so that a
proper choice of methods to decrease this number is quite essential.
Consequently, it is a matter of great practical importance to work out
effective methods of approximation, especially for multidimensional
problems such as interpolation of functions of many variables,
computation of multiple integrals, solution of systems of nonlinear algebraic
or transcendental equations, solution of three-dimensional integral
equations, systems of partial differential equations, and so forth.

At the same time there has been a considerable change in our attitude
of mind in estimating the value of approximative methods; they must be
judged by the ease with which they can be carried out on the machine
or by their universality, that is, by the extent of their applicability to
massive problems. Methods lose a great deal in value if they depend on
special peculiarities of the problem or on the skill of the person who is
directing the computation.

The greatest value must be attached to universal methods that apply
to a wide range of problems: difference methods, variational methods,
the gradient method, iterative methods, linearization, and so forth.

Of course, in choosing a computational method and the manner of 
carrying it out, one must remember that the method is in fact carried 
out on the machine, so that in some cases one ought to take into account 
the peculiarities of construction of the given machine. In particular, one 
must consider maximal use of the operative memory, minimization of the 
data introduced from outside, the possibility of introducing intermediate 
checks, and the convenience of programming the problem. 

But one must not think that the machine can carry out only the simplest 
methods, based on one kind of operation. The wide possibilities in 
programming and the latest improvements in its methods allow us to 
carry out very complicated computational programs with many different 
branches, so that we can change the course of the computation according 
to the results obtained, which is hard to do even with hand computations. 
The only essential requirement is that all these possibilities be completely
provided for in advance. 

Also one must not think that no methods can be carried out which 
require algebraic operations. As mentioned above, it is also completely 
possible to carry out some of the operations of analysis. 

Significance of the estimates of error. In estimates of error for
approximation methods, greater significance must be attached to those of an
asymptotic character, since large values of n (for example, the number
of equations replacing an integral equation by an algebraic system),
small steps in difference methods, and so forth, are fully realizable on
high-speed machines. In any comparison of the value of various
approximative methods, primary consideration must be given to asymptotic
estimates describing the rapidity of convergence of the method.

To increase the usefulness of machine methods, greater attention must
be paid to a posteriori estimates of the error; that is, estimates made on
the basis of the solution already computed. Such estimates may be
included in the program and will then help to determine the future course
of the computation. For example, if it is seen that the error is unacceptably
large, the computation may be automatically repeated with steps decreased
by half. In this connection a posteriori estimates may turn out to be
more convenient and practical than a priori ones, which are inevitably
too high and considerably more complicated.
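A program of this kind may be sketched for the computation of an integral by the trapezoidal rule. The error is estimated a posteriori by comparing the results for a step and for half that step (Runge's rule, which for a method of the second order takes one third of the difference); this particular rule and the names used are assumptions of the example, not something prescribed in the text.

```python
import math


def trapezoid(f, a, b, steps):
    """Trapezoidal rule with `steps` equal subintervals."""
    h = (b - a) / steps
    inner = sum(f(a + k * h) for k in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))


def integrate(f, a, b, tolerance, steps=4, max_halvings=20):
    """Halve the step until the a posteriori estimate is acceptable.

    Returns the value, the error estimate and the number of steps used.
    """
    previous = trapezoid(f, a, b, steps)
    for _ in range(max_halvings):
        steps *= 2
        current = trapezoid(f, a, b, steps)
        # Runge's rule for a second-order method: error of the finer result
        # is about one third of the change produced by halving the step.
        estimate = abs(current - previous) / 3
        if estimate <= tolerance:
            return current, estimate, steps
        previous = current
    raise RuntimeError("tolerance not reached within the allowed halvings")


if __name__ == "__main__":
    print(integrate(math.sin, 0.0, math.pi, 1e-6))
```

The course of the computation is here determined by the results already obtained: the number of steps is not fixed in advance but grows until the estimate falls below the tolerance.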

The possibility of theoretical analysis of the problem. There is still 
another possible use for the information obtained in the numerical 
solution of a problem. In fact, by applying the methods of functional 
analysis to the approximation obtained, we may judge the existence and 
uniqueness of the solution, and also establish the range of the solution. 
Since the investigation of such questions by purely theoretical methods 
is sometimes extremely complicated and lengthy, and in many cases 
altogether impracticable, the possibility of making use for this purpose 
of numerical calculations produced on the machine is undoubtedly of 
interest. 

New problems in numerical methods. The sharp increase in
computational possibilities and the accumulation of skill in their use has given
rise to an entirely new range of problems in the investigation of numerical
methods. Instead of being used in isolated cases as in the past, the solution
of systems of linear equations with a large number of unknowns has now
become established as a fixed element in the solution of mathematical
problems. This fact has given great practical importance to the following
question: How important for the accuracy of our determination of the
unknowns is the influence of rounding off, not only of the coefficients but
also of various processes in the course of the solution? This question
has led to a series of interesting investigations.

The possibility of numerical integration on the machine of a system
of differential equations over a large interval with small steps has given
acute importance to the question of stability of the process of numerical
integration. Experimental analysis of this question and subsequent
theoretical investigation have produced a considerable change in our
estimates of the value of various methods of numerical integration of
differential equations.

Questions of stability have primary significance also for the application
of difference methods to partial differential equations.
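What is meant by instability of a process of numerical integration can be seen in the simplest example. The sketch below applies Euler's method to the equation y' = -λy, whose exact solution decays to zero; the equation, the method and the particular values of the step are chosen here only as an illustration. For this method the computed values decay only when the step is less than 2/λ.

```python
def euler(decay_rate, step, steps, y0=1.0):
    """Integrate y' = -decay_rate * y by Euler's method and return the last value."""
    y = y0
    for _ in range(steps):
        y += step * (-decay_rate * y)  # each step multiplies y by 1 - step * decay_rate
    return y


if __name__ == "__main__":
    # With decay rate 50 the limit of stability is the step 2/50 = 0.04.
    print(euler(50.0, 0.01, 200))   # factor 0.5 per step: the values decay
    print(euler(50.0, 0.05, 200))   # factor -1.5 per step: the values grow without bound
```

The two runs differ only in the step, yet in the second the rounding and truncation errors are multiplied at every step instead of dying out.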

New methods. The possibility of using machines has led to the
appearance of completely new types of approximative and numerical
methods or on the other hand has made it quite possible and convenient
to employ the older methods in cases where up to now they had seemed
completely impracticable. A characteristic example is the method of
random sampling or, as it is often called, the “Monte-Carlo method.”
This method consists of finding a probability problem whose solution
(probability, mathematical expectation) is identical with the desired
quantity. In this probability problem the solution is found experimentally,
by random sampling, as the mean value in a series of experiments. For
example, to find the area of a figure defined by the inequality F(x, y) ≤ 0
and contained in the square (0, 1; 0, 1), we make as long a sequence as
we like of random choices of pairs of numbers (x, y) contained in this
square and then determine what fraction of these pairs satisfy the given
inequality. Of course, such a method will be very ineffective if the trials
are made by hand, but if they are done on a machine, then it is fully
practicable. The trials themselves may be carried out by means of a
table of random numbers. For certain problems, e.g., for calculating a
multiple integral without great exactness, such a method may even be
more effective than any other.
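The example of the area is carried out in the following sketch. It assumes that the inequality has the form F(x, y) ≤ 0, takes for the figure the quarter of the unit disc (so that the exact area π/4 is known for comparison), and replaces the table of random numbers by the pseudorandom generator of the standard library with a fixed seed; all the names are chosen for the illustration.

```python
import math
import random


def monte_carlo_area(inside, trials, seed=0):
    """Fraction of random points of the unit square that fall in the figure."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if inside(x, y):
            hits += 1
    return hits / trials


def quarter_disc(x, y):
    """The figure F(x, y) <= 0 with F = x^2 + y^2 - 1."""
    return x * x + y * y - 1.0 <= 0.0


if __name__ == "__main__":
    for trials in (100, 10_000, 1_000_000):
        estimate = monte_carlo_area(quarter_disc, trials)
        print(trials, estimate, abs(estimate - math.pi / 4))
```

The error decreases only as the inverse square root of the number of trials, which is why the method is practicable on a machine but hopeless by hand, and why it is suited to results that are not needed with great exactness.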

A similar method may also be used for the problem of inverting a
matrix, if we apply it to samples forming a Markov chain, and also for
the solution of partial differential equations, if we have found a stochastic
(probabilistic) process connected with it.

The significance of high-speed machines for mathematical analysis,
mechanics, and physics. In mathematical analysis great interest and
practical importance is attached to investigations of multidimensional
problems leading to the integral equations and boundary-value problems
of mathematical physics. These investigations and the resulting methods
of solution are no longer impracticable but will now be put into effect
as a result of the new computing techniques, especially since the solution
of such problems is of urgent importance at the present time.

Of course, the value of these newly developed methods must be judged 
by the ease with which it is possible to put them into practice. 

On the other hand the possibility, thanks to machines, of carrying out 
with sufficient exactness a computation involving a large number of 
trials has led to an enormous extension in the range of application of 
“mathematical experiments” for the preliminary investigation of a 
mathematical problem and to a great increase in their effectiveness. This 
fact has made it important to work out applications of this Monte-Carlo 
method not only in general but also for particular problems; for example, 
the qualitative investigation of differential equations.





It is interesting also to note that the machines may be used in problems
of analysis not only in applications but also for purely theoretical
questions. Thus machine computation may prove necessary to increase the
accuracy of the constants in certain inequalities and estimates in functional
analysis; applications of this sort occur not only in analysis but also in
the theory of numbers.

Finally, machines may be used for testing the correctness of formulas 
of mathematical logic, and since many mathematical propositions and 
proofs can be written by means of the symbols of mathematical logic, 
it becomes theoretically possible to test on high-speed machines the 
logical correctness of certain mathematical deductions.

As for mechanics and physics, we must first of all emphasize the vast
increase in the application of mathematics in these sciences. Up to the
present time the application of mathematics to concrete problems of
mathematical physics was restricted by the enormous volume and
complicated character of the necessary computations. In the problems arising
in actual practice, this volume was usually such that the computation
for one problem required several months and in some cases even several
years of computational work. Thus, in spite of the fact that general
mathematical formulations of many problems were known in mechanics
and theoretical physics, and methods of their solution had been worked
out in theory, in actual fact mathematical solutions, exact or numerical,
had been obtained only for a few idealized and highly simplified cases,
such as plane or axially symmetric problems, especially simple boundaries,
or an airplane wing of infinite length.

As a result, the mathematical solutions were used not so much for 
finding the necessary numerical values as for a qualitative and tentative 
investigation of the problem, which in practice had to be supplemented 
by costly experiments. 

On the other hand, the application of new computing methods opens 
up the possibility of large-scale solutions of problems of mechanics and 
physics with all their actual complications (space problems, problems 
with complicated boundary contours, and nonlinear partial differential 
equations). 

Of course, the actual carrying out of this possibility requires further 
development of the methods of numerical analysis and of machine 
solution for these problems. However, the practicability of treating such 
problems in this way has been strikingly demonstrated by successful 
experience in solving, on high-speed machines, systems of partial 
differential equations in meteorology, in gas dynamics, in the equations 
of friable materials, and in other questions. 

The application of theoretical mathematical analysis to problems of 
mechanics and physics with a close approximation to the actual physical 
problems, together with the increase in rapidity and flexibility resulting 
from the use of high-speed machines, has made it possible in many cases to 
replace physical experiments by mathematical ones. This possibility will lead to 
further improvement in the methods of investigating problems in physics 
and mechanics and will increase the role played in them by theoretical 
and computational methods. 

The significance of electronic machines for technology and industry. 

The rapidity and effectiveness of numerical solutions of problems of 
mathematical analysis also allow us to make much greater use in the 
various branches of technology (structural mechanics, electrical engineering 
and radiotechnology, the exploitation of water power, and so 
forth) of theoretical methods and consequently to produce much more 
accurate and practical results. It is now possible to apply mathematical 
analysis to many technical problems where it has not been used before. 

In addition to the numerical solution of problems of mathematical 
analysis encountered in technology, a completely different application of 
mathematical machines to technology has been discovered. It will be 
possible to apply mathematical machines, for example in technical 
planning, to the choice of various possibilities for the construction or 
distribution of various objects. In questions of the organization of an 
industry many solutions are possible to the problem of distributing the 
various tasks and determining their proper sequence. The choice of the 
best, the most productive, and the most economical solution presents 
great difficulty. Here also one may find applications of machines; if it is 
possible to program a systematic examination of various solutions that 
takes account of the features of interest to us, then with the help of the 
machines we may compare several hundred thousand variants, which 
would be impossible by usual methods. 
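
A systematic examination of variants can be sketched as follows. The example is hypothetical: it assumes four tasks carried out one after another on a single machine, with invented task names and durations, and takes as the criterion the sum of the times at which the tasks are completed. Every possible sequence is generated and the best one is kept.

```python
from itertools import permutations


def total_completion_time(order, durations):
    """Sum of the finishing times of all tasks performed in the given order."""
    elapsed = 0
    total = 0
    for task in order:
        elapsed += durations[task]  # moment at which this task is finished
        total += elapsed
    return total


def best_order(durations):
    """Examine every ordering of the tasks and return the cheapest one."""
    return min(
        permutations(durations),
        key=lambda order: total_completion_time(order, durations),
    )


# Hypothetical data: task names and durations are assumptions for illustration.
durations = {"cut": 4, "drill": 1, "paint": 3, "pack": 2}
order = best_order(durations)
print(order, total_completion_time(order, durations))
```

With n tasks there are n! sequences, so ten tasks already give more than three million variants. Exhaustive comparison of this kind is practical only on a machine, and only for moderate n.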

In particular, a series of relay-contact circuits allows us to analyze and 
verify these solutions by the methods of mathematical logic, which may be 
carried out on high-speed machines. In this way it is possible to select 
a set of such variants on the basis of any desired criteria and then to 
choose the best one among this selected set. 

Of great promise is the use of machines in the automatic control of 
industry, if such machines are used in conjunction with servomechanisms 
and transmission devices. For example, if geometric data concerning a 
manufactured article are introduced into the machine, together with a 
specific program for the purpose, it will determine and transmit parameters 
that will govern the motion of a power press and make necessary changes 
in the article. Because of its high speed, the same electronic machine 
may be used for simultaneous control of the work of several presses. 
It is also easy to see the significance of such machines for automatic 
guidance of moving objects, for example interplanetary rocket projectiles, 
since the guidance program can take into account not only the data 
originally introduced but also the changes in position indicated by various 
recording devices. 
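
The principle of a guidance program that corrects its commands from measured positions can be shown in a deliberately simplified sketch. Everything in it is an assumption made for illustration: motion is one-dimensional, the correction is proportional to the measured deviation from the target, and the function name `guide`, the gain, and the disturbance values are hypothetical.

```python
def guide(position, target, gain, disturbances):
    """Simulate step-by-step guidance with proportional correction.

    At each step an external disturbance shifts the object; the program then
    reads the new position and moves the object a fraction `gain` of the way
    toward the target.
    """
    history = [position]
    for drift in disturbances:
        position += drift                        # unplanned change in position
        position += gain * (target - position)   # correction from measured error
        history.append(position)
    return history


track = guide(position=0.0, target=10.0, gain=0.5,
              disturbances=[0.0, 1.0, -2.0, 0.0, 0.0, 0.0])
print(track)
```

A program using only the data originally introduced could not compensate for the disturbances at the second and third steps; the measured positions are what allow the deviation to shrink again.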

In this way, the construction and analysis of computing machines and 
the possibilities of their application present a wide field of activity for 
mathematicians. The use of mathematical machines in the coming years 
will undoubtedly play a great role in the development of our technology 
and culture. 


Suggested Reading 

F. L. Alt, Electronic digital computers: their use in science and engineering, 
Academic Press, New York, 1958. 

A. D. Booth and K. H. V. Booth, Automatic digital calculators, Academic Press, 
New York, 1953. 

P. von Handel (Editor), Electronic computers: fundamentals, systems and applications, 
Prentice-Hall, Englewood Cliffs, N. J., 1961. 

G. N. Lance, Numerical methods for high-speed computers, Iliffe and Sons, 
London, 1960. 

D. D. McCracken, Digital computer programming, Wiley, New York, 1957. 

J. J. Murray, Mathematical machines, I, Digital computers, II, Analog devices, 
Columbia University Press, New York, 1961. 

J. von Neumann, The computer and the brain, Yale University Press, New Haven, 
Conn., 1958. 

R. K. Richards, Arithmetic operations in digital computers, Van Nostrand, New 
York, 1955. 

G. R. Stibitz and J. A. Larrivee, Mathematics and computers, McGraw-Hill, 
New York, 1957. 

M. V. Wilkes, Automatic digital computers, Wiley, New York, 1956. 



INDEX OF NAMES 


Abel, N. H., 143 
Aleksandrov, A. D., 114 
d’Alembert, J., 36 


Bernoulli, D., 36, 269, 293, 297 
Bernšteĭn, S. N., 246, 247, 262, 267, 
268, 281, 287, 290, 291 
Bertrand, J. L. F., 207 
Blaschke, W., 116 
Bogoljubov, N. N., 257 
Bonnet, P. O., 103 
Borel, E., 268 
Boyle, R., 14 


Čaplygin, S. A., 160, 162 
Cauchy, A., 150, 153, 157, 162, 166, 
171-173, 176-180 

Čebyšev, P. L., 60, 203, 206, 207, 208, 
211, 214, 223, 244-247, 275, 281-287, 
291, 299 
Čech, E., 116 
Clairaut, A. C., 60 
Codazzi, 61, 103, 108 
Čudakov, N. G., 209 


Darboux, G., 61, 101 
Dedekind, R., 203 
Descartes, R., 104 
Dirichlet, P. G. L., 53, 54 

Efimov, N. V., 102 
Eratosthenes, 204 


Euclid, 204, 207, 210 
Euler, L., 36, 60, 81, 83, 85, 91, 100, 
108, 124, 127-129, 131, 133, 146, 
147, 203, 205, 207, 208, 211, 220, 
269, 293, 305, 316, 336-338 


Faber, G., 276 
Fejér, L., 298, 301 
Fermat, P., 225 
Finikov, S. P., 116 
Fokker, 261 

Fourier, J. B. J., 28, 36, 40, 268, 269, 
289, 291, 294, 296-298 
Frenet, J. F., 61 
Fresnel, A. J., 333 
Fubini, G., 116 


Galerkin, B. G., 40, 41, 53, 305, 306 
Galileo, 255, 256 

Gauss, K. F., 89, 90, 91, 103, 104, 108, 
203, 206, 207, 281, 282 
Gel'fond, A. O., 203 
Gjunter, N. M., 52 
Goldbach, C., 211, 217, 219, 224 
Grassmann, H., 104 


Hadamard, J., 208 

Hamilton, W. R., 50, 51, 52, 104, 132, 
133 

Hardy, G. H., 203 
Hinčin, A. Ja., 246 
l'Hôpital, 59 




Jackson, D., 268, 290 


Keldyš, M. V., 290 

Kirchhoff, G. R., 328 

Kolmogorov, A. N., 262, 268, 269, 297 

Kolosov, G. V., 163 

Korkin, A. N., 267 

Kuiper, N. H., 115 

Kummer, E. E., 203 


Lagrange, J. L., 50, 203, 273, 281 
Laplace, P.-S., 15, 16, 17, 24, 28, 35, 
36, 44, 49, 163, 242, 258, 307, 327, 
328 

Laptev, G. F., 116 
Lavrent'ev, M. A., 174, 268, 290 
Lebesgue, H., 52, 268, 297, 302 
Legendre, A. M., 206, 282 
Leibnitz, G. W., 59, 110, 322 
Linnik, Ju. V., 210 
Littlewood, J. E., 203 
Ljapunov, A. M., 40, 247 
Ljusternik, L. A., 112 


Mariotte, E., 14 

Markov, A. A., 246, 247, 260, 261, 267, 
287, 371 

Markov, V. A., 267 

Mendeleev, D. I., 287 

Men'šov, D. E., 269 

Mergeljan, S. N., 290 

Meusnier, J. B. M., 83, 85, 86, 91, 108 

Minding, F., 60, 100, 111 

Minkowski, H., 113, 204 

de Moivre, A., 242 

Monge, G., 110, 111 

Moreland, 322 

Morse, M., 112 

Mushelišvili, N. I., 163 


Nash, J., 115 

Newton, I., 10, 11, 15, 59, 147, 255, 
267, 273 


Odner, V. T., 322 

Ostrogradskiĭ, M. V., 8, 40, 132, 133, 
134 


Pascal, B., 322 

Peterson, K. M., 61, 103, 108 

Petrovskiĭ, I. G., 262 

Picard, E., 183 

Planck, M. K. E. L., 261 

Plateau, I. A. F., 88 

Pogorelov, A. V., 115 

Poincaré, H., 112, 208 

Poisson, S. D., 15, 16, 17, 28, 36, 40 


Ramanujan, S., 203 

Riemann, B., 150, 153, 157, 162, 166, 
169-173, 184, 191-193, 203, 208, 
209 

Ritz, W., 53, 54, 306 


Schrödinger, E., 53 
Selberg, A., 208 

Simpson, T., 276-281, 307, 314, 315 
Šnirel'man, L. G., 112 
Sobolev, S. L., 50, 53 
Stepanov, V. V., 36, 257 
Sylvester, J. J., 208 


Taylor, B., 273, 274 
Trefftz, J., 53 


de la Vallée-Poussin, C. J., 268 
Vinogradov, I. M., 203, 204, 209-211, 
217, 219, 220, 224, 269 
Voronoĭ, G. F., 204 


Weierstrass, K. T., 268, 288-290 
Wilczynski, E. J., 116 
Wilson, C. T. R., 87 


Zolotarev, E. I., 203, 267 
Žukovskiĭ, N. E., 159, 160, 162 



CONTENTS OF THE SERIES 


Volume I 

PART 1 

Chapter I      A GENERAL VIEW OF MATHEMATICS      A. D. Aleksandrov 

Chapter II     ANALYSIS      M. A. Lavrent'ev and S. M. Nikol'skiĭ 

PART 2 

Chapter III    ANALYTIC GEOMETRY      B. N. Delone 

Chapter IV     ALGEBRA: THEORY OF ALGEBRAIC EQUATIONS      B. N. Delone 

Chapter V      ORDINARY DIFFERENTIAL EQUATIONS      I. G. Petrovskiĭ 


Volume III 

PART 5 

Chapter XV     THEORY OF FUNCTIONS OF A REAL VARIABLE      S. B. Stečkin 

Chapter XVI    LINEAR ALGEBRA      D. K. Faddeev 

Chapter XVII   NON-EUCLIDEAN GEOMETRY      A. D. Aleksandrov 

PART 6 

Chapter XVIII  TOPOLOGY      P. S. Aleksandrov 

Chapter XIX    FUNCTIONAL ANALYSIS      I. M. Gel'fand 

Chapter XX     GROUPS AND OTHER ALGEBRAIC SYSTEMS      A. I. Mal'cev 



